RL with KL penalties is better viewed as Bayesian inference.
Tomasz Korbak, Ethan Perez, Christopher L. Buckley
Published in: CoRR (2022)
Keyphrases
- Bayesian inference
- reinforcement learning
- hyperparameters
- prior information
- probabilistic model
- statistical inference
- variational inference
- Gibbs sampler
- Bayesian models
- variational Bayes
- hierarchical Bayesian
- hidden variables
- probabilistic modeling
- Kullback-Leibler
- optimal policy
- learning algorithm
- Bayesian model
- variational approximation
- probability distribution
- expectation propagation
- search algorithm
- information theoretic
- mixture model
- generative model
- particle filter