Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model.
Bingyan Wang, Yuling Yan, Jianqing Fan
Published in: NeurIPS (2021)
Keyphrases
- generative model
- reinforcement learning
- markov decision processes
- probabilistic model
- state space
- bayesian framework
- em algorithm
- reward function
- discriminative models
- function approximation
- dynamic programming
- discriminative learning
- optimal policy
- prior knowledge
- machine learning
- latent dirichlet allocation
- policy iteration
- partially observable
- policy search
- topic models
- learning process
- feature space
- e-learning
- learning algorithm