Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents.
Eiji Uchibe, Kenji Doya
Published in: ICONIP (2) (2007)
Keyphrases
- reinforcement learning
- Markov decision processes
- function approximation
- state space
- model free
- temporal difference
- dynamic programming
- supervised learning
- optimal policy
- reward shaping
- data sets
- average reward
- action selection
- multi agent
- machine learning
- neural network
- optimal control
- learning problems
- evolutionary algorithm
- reinforcement learning algorithms
- partially observable
- learning agent
- control policy
- sensory inputs