Posterior Sampling for Large Scale Reinforcement Learning.
Georgios Theocharous, Zheng Wen, Yasin Abbasi-Yadkori, Nikos Vlassis
Published in: CoRR (2017)
Keyphrases
- reinforcement learning
- markov chain monte carlo
- real life
- small scale
- real world
- learning algorithm
- posterior distribution
- probabilistic model
- probability distribution
- metropolis hastings
- function approximation
- monte carlo
- posterior probability
- state space
- optimal control
- neural network
- sample size
- bayesian framework
- markov chain
- dynamic programming
- gaussian process
- multi agent
- temporal difference learning
- policy search