Bayesian Nonparametric Multi-Optima Policy Search in Reinforcement Learning.
Danilo Bruno, Sylvain Calinon, Darwin G. Caldwell
Published in: AAAI (2013)
Keyphrases
- policy search
- reinforcement learning
- continuous state
- reinforcement learning algorithms
- reward function
- inverse reinforcement learning
- dynamic programming
- state space
- Markov decision processes
- function approximation
- partially observable Markov decision processes
- optimal solution
- transfer learning
- learning problems
- model-free
- policy gradient
- sufficient conditions
- supervised learning
- function approximators
- machine learning