KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal.
Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári
Published in: CoRR (2022)
Keyphrases
- generative model
- probabilistic model
- Bayesian framework
- mixture model
- discriminative learning
- reinforcement learning
- prior knowledge
- EM algorithm
- semi-supervised
- discriminative models
- posterior probability
- dynamic programming
- worst case
- topic models
- learning algorithm
- multiscale
- image processing
- latent Dirichlet allocation
- expectation maximization
- evaluation function
- Kullback-Leibler divergence
- machine learning
- control policy
- data sets
- optimal policy
- learned models