V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control.
H. Francis Song, Abbas Abdolmaleki, Jost Tobias Springenberg, Aidan Clark, Hubert Soyer, Jack W. Rae, Seb Noury, Arun Ahuja, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Dan Belov, Martin A. Riedmiller, Matthew M. Botvinick
Published in: CoRR (2019)
Keyphrases
- maximum a posteriori
- markov random field
- map estimation
- image reconstruction
- maximum likelihood
- expectation maximization
- bayesian framework
- em algorithm
- energy function
- action space
- ground truth
- hyperparameters
- edge preserving
- generalized gaussian
- high quality
- prior model
- posterior distribution
- prior information
- graphical models
- state space
- pairwise