Reinforcement learning from human reward: Discounting in episodic tasks.

W. Bradley Knox, Peter Stone

Published in: RO-MAN (2012)

Keyphrases

reinforcement learning
function approximation
transfer learning
episodic memory
learning process
state space
complex domains
human operators
multi-agent
dynamic programming
human users
eligibility traces
multi-agent environments
temporal difference
human subjects
model-free
reinforcement learning agents
partially observable environments
machine learning
real robot
reinforcement learning algorithms
optimal policy
learning algorithm