Reinforcement Learning Using Expectation Maximization Based Guided Policy Search for Stochastic Dynamics.
Prakash Mallick, Zhiyong Chen, Mohsen Zamani
Published in: CoRR (2020)
Keyphrases
- policy search
- reinforcement learning
- expectation maximization
- em algorithm
- continuous state
- reinforcement learning algorithms
- continuous action
- control policies
- dynamic programming
- generative model
- function approximation
- probabilistic model
- dynamical systems
- state space
- maximum likelihood
- partially observable markov decision processes
- function approximators
- policy gradient
- monte carlo methods
- temporal difference
- reward function
- state dependent
- markov decision problems
- monte carlo
- action selection
- model free
- decision theoretic
- optimal policy
- image segmentation