Variational Regret Bounds for Reinforcement Learning.
Pratik Gajane
Ronald Ortner
Peter Auer
Published in:
CoRR (2019)
Keyphrases
reinforcement learning
regret bounds
multi-armed bandit
linear regression
image segmentation
lower bound
optical flow
state space
Markov decision processes
online learning
Bregman divergences
learning process
upper bound
least squares
maximum entropy