Co-evolution of Shaping Rewards and Meta-Parameters in Reinforcement Learning.
Stefan Elfwing, Eiji Uchibe, Kenji Doya, Henrik I. Christensen
Published in: Adapt. Behav. (2008)
Keyphrases
- supervised learning
- reinforcement learning
- reward shaping
- function approximation
- learning algorithm
- machine learning
- learning problems
- temporal difference
- state space
- Markov decision processes
- parameter estimation
- hidden state
- optimal policy
- transfer learning
- measured data
- meta level
- multi-agent
- multi-agent reinforcement learning
- Markov decision problems
- complex domains
- reward function
- optimal control
- parameter values
- expectation maximization
- maximum likelihood
- dynamic programming