Improving Safety in Deep Reinforcement Learning using Unsupervised Action Planning.
Hao-Lun Hsu, Qiuhua Huang, Sehoon Ha
Published in: ICRA (2022)
Keyphrases
- reinforcement learning
- action selection
- reward shaping
- partially observable domains
- supervised learning
- action space
- temporal difference
- planning problems
- plan execution
- deep architectures
- derived predicates
- external events
- macro actions
- markov decision processes
- semi-supervised
- concurrent actions
- state space
- deterministic domains
- transition model
- markov decision problems
- enforced hill climbing
- partially observable markov decision processes
- machine learning
- action selection mechanism
- state action
- complex domains
- continuous state
- reinforcement learning algorithms
- unsupervised learning
- markov decision process
- model free
- motion planning
- function approximation
- human actions
- heuristic search
- optimal policy
- ai planning
- sensing actions
- action sequences
- initial state