Trial without Error: Towards Safe Reinforcement Learning via Human Intervention.

William Saunders, Girish Sastry, Andreas Stuhlmüller, Owain Evans

Published in: CoRR (2017)

Keyphrases

reinforcement learning
error rate
machine learning
learning algorithm
function approximation
website
error bounds
linear complexity
state space
reinforcement learning algorithms
temporal difference
temporal difference learning
data sets
partially observable
model-free
learning problems
Markov decision processes
learning process
multi-agent
training data