An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Reward.
Matthew Hoffman, Nando de Freitas, Arnaud Doucet, Jan Peters
Published in: AISTATS (2009)
Keyphrases
- expectation maximization
- markov decision processes
- dynamic programming
- average reward
- em algorithm
- k means
- learning algorithm
- computational complexity
- model based reinforcement learning
- monte carlo
- probabilistic model
- search space
- reinforcement learning
- state space
- np hard
- finite state
- decision theoretic planning
- reachability analysis