Online Expectation Maximization for Reinforcement Learning in POMDPs.
Miao Liu, Xuejun Liao, Lawrence Carin
Published in: IJCAI (2013)
Keyphrases
- reinforcement learning
- expectation maximization
- em algorithm
- function approximation
- online learning
- markov decision processes
- continuous state
- state space
- optimal policy
- balancing exploration and exploitation
- partially observable markov decision processes
- learning algorithm
- model free
- partially observable
- policy search
- generative model
- mixture model
- neural network
- markov decision process
- machine learning
- multi agent
- probabilistic model
- partial observability
- reinforcement learning algorithms
- dynamic programming
- maximum likelihood
- belief state
- temporal difference
- learning process
- gaussian mixture model
- control policies
- belief space
- real time
- parameter estimation