Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits.
Yingkai Li
Yining Wang
Yuan Zhou
Published in:
COLT (2019)
Keyphrases
worst case
regret bounds
minimax regret
multi-armed bandit
dynamic programming
optimal solution
lower bound
closed form
evaluation function
learning algorithm
reinforcement learning
active learning
upper bound
optimal strategy
multi-armed bandits