Bayesian Nonparametric Multi-Optima Policy Search in Reinforcement Learning.

Danilo Bruno, Sylvain Calinon, Darwin G. Caldwell

Published in: AAAI (2013)

Keyphrases

policy search
reinforcement learning
continuous state
reinforcement learning algorithms
reward function
inverse reinforcement learning
dynamic programming
state space
Markov decision processes
function approximation
partially observable Markov decision processes
optimal solution
transfer learning
learning problems
model-free
policy gradient
sufficient conditions
supervised learning
function approximators
machine learning