Improving Safety in Deep Reinforcement Learning using Unsupervised Action Planning.
Hao-Lun Hsu, Qiuhua Huang, Sehoon Ha
Published in: CoRR (2021)
Keyphrases
- action selection
- reinforcement learning
- partially observable domains
- reward shaping
- supervised learning
- action selection mechanism
- derived predicates
- action space
- planning problems
- temporal difference
- state space
- optimal policy
- Markov decision processes
- function approximation
- semi-supervised
- partially observable
- macro actions
- unsupervised learning
- machine learning
- action sequences
- search space
- dynamic programming
- decision theoretic planning
- partially observable Markov decision processes
- Markov decision problems
- action models
- deep architectures
- external events
- deterministic domains
- multi-agent
- motion planning
- dynamic environments
- heuristic search