Posterior Sampling for Reinforcement Learning Without Episodes.
Ian Osband, Benjamin Van Roy
Published in: CoRR (2016)
Keyphrases
- reinforcement learning
- markov chain monte carlo
- function approximation
- metropolis hastings
- random sampling
- posterior distribution
- state space
- monte carlo
- probability distribution
- multi agent
- learning algorithm
- sampling strategy
- event sequences
- reinforcement learning algorithms
- model free
- machine learning
- optimal control
- learning process
- gaussian process
- posterior probability
- temporal difference
- transfer learning
- sampling methods
- function approximators
- markov chain
- reinforcement learning methods
- importance sampling
- image reconstruction
- sampled data
- transition model
- supervised learning