A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model
Kenji Kawaguchi, Mauricio Araya-López
Published in: CoRR (2013)
Keyphrases
- transition model
- Bayesian reinforcement learning
- reinforcement learning
- Markov decision problems
- optimal policy
- dynamic programming
- reward function
- queueing networks
- Monte Carlo tree search
- search algorithm
- linear programming
- approximation methods
- state space
- decision theoretic
- decision problems
- partially observable Markov decision processes
- decision processes
- search space