Evolution of reward functions for reinforcement learning.
Scott Niekum, Lee Spector, Andrew G. Barto
Published in: GECCO (Companion) (2011)
Keyphrases
- reward function
- reinforcement learning
- reinforcement learning algorithms
- state space
- markov decision processes
- policy search
- optimal policy
- inverse reinforcement learning
- markov decision process
- partially observable
- transition model
- function approximation
- transition probabilities
- learning agents
- model free
- multiple agents
- multi agent
- simple examples
- initially unknown
- state action
- state variables
- generative model
- machine learning
- control policy
- preference elicitation
- control policies
- markov decision problems
- action selection
- data mining