Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning.
Ahmadreza Moradipari
Mohammad Pedramfar
Modjtaba Shokrian Zini
Vaneet Aggarwal
Published in: CoRR (2023)
Keyphrases
reinforcement learning
multi-armed bandit
regret bounds
markov chain monte carlo
machine learning
bayesian inference
random sampling
bayesian networks
lower bound
learning process
state space
worst case
sample size
markov decision processes
learning problems