Safe Reinforcement Learning with Dead-Ends Avoidance and Recovery.

Xiao Zhang, Hai Zhang, Hongtu Zhou, Chang Huang, Di Zhang, Chen Ye, Junqiao Zhao

Published in: CoRR (2023)

Keyphrases

reinforcement learning
dead ends
path planning
search tree
learning algorithm
state space
multi agent
optimal policy
heuristic functions
machine learning
heuristic function
path finding
dynamic programming
orders of magnitude
search algorithm
Markov decision processes
branch and bound algorithm