Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning.
Odalric-Ambrym Maillard
Phuong Nguyen
Ronald Ortner
Daniil Ryabko
Published in:
ICML (1) (2013)
Keyphrases
reinforcement learning
regret bounds
state space
multi-armed bandit
dynamic programming
optimal solution
machine learning
upper bound
closed form
learning theory