BelMan: Bayesian Bandits on the Belief-Reward Manifold.
Debabrota Basu, Pierre Senellart, Stéphane Bressan
Published in: CoRR (2018)
Keyphrases
- multi armed bandit
- Dempster-Shafer
- reinforcement learning
- bayesian networks
- multi armed bandits
- belief functions
- manifold learning
- low dimensional
- bandit problems
- stochastic systems
- posterior distribution
- bayesian inference
- belief revision
- long run
- bayesian learning
- euclidean space
- maximum likelihood
- evidential reasoning
- bayesian estimation
- bayesian methods
- parameter space
- average reward
- belief state
- reward function
- knowledge base