Information Directed Reward Learning for Reinforcement Learning.

David Lindner, Matteo Turchetta, Sebastian Tschiatschek, Kamil Ciosek, Andreas Krause

Published in: NeurIPS (2021)

Keyphrases

reinforcement learning
learning algorithm
learning process
information sources
end users
function approximation
prior knowledge
information extraction
supervised learning
online learning
eligibility traces
dynamic programming
state space
user interaction
Markov decision processes
reinforcement learning algorithms
robot control
learning agents
temporal difference learning
active exploration