Reinforcement learning from human reward: Discounting in episodic tasks.
W. Bradley Knox, Peter Stone
Published in: RO-MAN (2012)
Keyphrases
- reinforcement learning
- function approximation
- transfer learning
- episodic memory
- learning process
- state space
- complex domains
- human operators
- multi-agent
- dynamic programming
- human users
- eligibility traces
- multi-agent environments
- temporal difference
- human subjects
- model-free
- reinforcement learning agents
- partially observable environments
- machine learning
- real robot
- reinforcement learning algorithms
- optimal policy
- learning algorithm