Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms.
Mengfan Xu
Diego Klabjan
Published in:
CoRR (2020)
Keyphrases
reinforcement learning
learning algorithm
computational complexity
lower bound
multi-armed bandit
mutual information
policy iteration