Finding intrinsic rewards by embodied evolution and constrained reinforcement learning.
Eiji Uchibe, Kenji Doya
Published in: Neural Networks (2008)
Keyphrases
- reinforcement learning
- Markov decision processes
- function approximation
- learning algorithm
- machine learning
- optimal control
- model-free
- reinforcement learning algorithms
- partially observable
- multi-agent
- action selection
- reward shaping
- dynamic programming
- state space
- sufficient conditions
- optimal policy
- complex domains
- policy iteration