Safe Reinforcement Learning via Shielding.
Mohammed Alshiekh, Roderick Bloem, Rüdiger Ehlers, Bettina Könighofer, Scott Niekum, Ufuk Topcu
Published in: CoRR (2017)
Keyphrases
- reinforcement learning
- function approximation
- state space
- model free
- optimal policy
- data sets
- temporal difference
- direct policy search
- partially observable domains
- Markov decision processes
- dynamic programming
- action selection
- multi agent
- reinforcement learning algorithms
- control problems
- function approximators
- learning algorithm
- artificial intelligence
- robotic control
- relational reinforcement learning
- multi agent reinforcement learning
- stochastic approximation
- learning agents
- control policy
- Markov decision process
- mobile robot