Reinforcement Learning with a Corrupted Reward Channel.
Tom Everitt
Victoria Krakovna
Laurent Orseau
Shane Legg
Published in:
IJCAI (2017)
Keyphrases
reinforcement learning
function approximation
state space
model free
multi channel
eligibility traces
multi agent
markov decision processes
reinforcement learning algorithms
machine learning
dynamic programming
average reward
temporal difference
learning algorithm
action selection
communication channels
noise free
transfer learning
reinforcement learning methods
partially observable environments
learning classifier systems
reward function
learning problems
markov decision process
function approximators
multiple access
supervised learning
total reward