Near-optimal Regret Bounds for Reinforcement Learning.
Peter Auer
Thomas Jaksch
Ronald Ortner
Published in:
NIPS (2008)
Keyphrases
reinforcement learning
regret bounds
multi-armed bandit
state space
model-free
optimal policy
Markov decision processes
machine learning
learning algorithm
lower bound
online learning
special case
learning problems
maximum entropy
temporal difference