Evolution of reward functions for reinforcement learning.

Scott Niekum, Lee Spector, Andrew G. Barto

Published in: GECCO (Companion) (2011)

Keyphrases

reward function
reinforcement learning
reinforcement learning algorithms
state space
Markov decision processes
policy search
optimal policy
inverse reinforcement learning
Markov decision process
partially observable
transition model
function approximation
transition probabilities
learning agents
model free
multiple agents
multi agent
simple examples
initially unknown
state action
state variables
generative model
machine learning
control policy
preference elicitation
control policies
Markov decision problems
action selection
data mining