Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs.
Yeoneung Kim
Insoon Yang
Kwang-Sung Jun
Published in: NeurIPS (2022)
Keyphrases
linear systems
objective function
data analysis
upper bound
maximum likelihood
optimal policy
closed form
regret bounds