Policy Learning for Robust Markov Decision Process with a Mismatched Generative Model.
Jialian Li, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu
Published in: CoRR (2022)
Keyphrases
- generative model
- Markov decision process
- learning algorithm
- prior knowledge
- learning process
- discriminative learning
- probabilistic model
- reinforcement learning
- hidden variables
- inverse reinforcement learning
- reward function
- optimal policy
- topic models
- Bayesian framework
- EM algorithm
- image processing
- learning tasks
- supply chain
- semi-supervised
- learned models
- active learning