Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents.

Eiji Uchibe, Kenji Doya

Published in: ICONIP (2) (2007)

Keyphrases

reinforcement learning
Markov decision processes
function approximation
state space
model-free
temporal difference
dynamic programming
supervised learning
optimal policy
reward shaping
data sets
average reward
action selection
multi-agent
machine learning
neural network
optimal control
learning problems
evolutionary algorithm
reinforcement learning algorithms
partially observable
learning agent
control policy
sensory inputs