Posterior Sampling for Deep Reinforcement Learning.
Remo Sasso, Michelangelo Conserva, Paulo E. Rauber
Published in: ICML (2023)
Keyphrases
- reinforcement learning
- markov chain monte carlo
- metropolis hastings
- function approximation
- monte carlo
- optimal policy
- posterior distribution
- probabilistic model
- sampling methods
- model free
- random sampling
- robotic control
- deep learning
- markov decision processes
- multi agent
- state space
- machine learning
- reinforcement learning algorithms
- posterior probability
- parameter space
- learning problems
- markov chain
- probability distribution
- optimal control
- bayesian framework
- sample size
- class probabilities
- temporal difference learning
- proposal distribution
- supervised learning