Safe Reinforcement Learning via Shielding.

Mohammed Alshiekh, Roderick Bloem, Rüdiger Ehlers, Bettina Könighofer, Scott Niekum, Ufuk Topcu

Published in: CoRR (2017)

Keyphrases

reinforcement learning
function approximation
state space
model-free
optimal policy
data sets
temporal difference
direct policy search
partially observable domains
Markov decision processes
dynamic programming
action selection
multi-agent
reinforcement learning algorithms
control problems
function approximators
learning algorithm
artificial intelligence
robotic control
relational reinforcement learning
multi-agent reinforcement learning
stochastic approximation
learning agents
control policy
Markov decision process
mobile robot