Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits.
Yu-Heng Hung, Ping-Chun Hsieh, Xi Liu, P. R. Kumar
Published in: AAAI (2021)
Keyphrases
- maximum likelihood estimation
- EM algorithm
- maximum likelihood
- multi-armed bandit
- parameter estimation
- regret bounds
- probability distribution
- stochastic systems
- expectation maximization
- Boltzmann machine
- multivariate Gaussian
- mixture of Gaussians
- reinforcement learning
- machine learning
- conjugate gradient
- density function
- linear regression
- Bayesian networks