Safe Reinforcement Learning Using Advantage-Based Intervention.
Nolan Wagener, Byron Boots, Ching-An Cheng
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- function approximation
- optimal policy
- Markov decision processes
- learning process
- evolutionary algorithm
- mobile robot
- state space
- temporal difference
- model free
- supervised learning
- direct policy search
- temporal difference learning
- real robot
- reinforcement learning algorithms
- action selection
- database
- similarity measure
- case study
- genetic algorithm
- machine learning