Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration & Planning.
Reda Ouhamma, Debabrota Basu, Odalric Maillard
Published in: AAAI (2023)
Keyphrases
- exponential family
- regret bounds
- Bregman divergences
- density estimation
- maximum likelihood
- Markov decision processes
- log likelihood
- closed form
- graphical models
- missing values
- statistical models
- mixture model
- reinforcement learning
- order statistics
- state space
- variational methods
- hidden variables
- probability density function
- KL divergence
- belief propagation
- statistical model
- least squares