Offline Goal-Conditioned Reinforcement Learning for Safety-Critical Tasks with Recovery Policy.
Chenyang Cao, Zichen Yan, Renhao Lu, Junbo Tan, Xueqian Wang
Published in: CoRR (2024)
Keyphrases
- safety critical
- reinforcement learning
- optimal policy
- real time
- formal methods
- fault tolerant
- agent learns
- embedded systems
- partially observable
- markov decision process
- nuclear power plant
- safety analysis
- state space
- reward function
- machine learning
- decision makers
- markov decision processes
- low cost
- agent architecture
- artificial intelligence
- learning algorithm