Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds.
Andrea Zanette
Emma Brunskill
Published in: CoRR (2019)
Keyphrases
domain knowledge
regret bounds
reinforcement learning
upper bound
lower bound
function approximation
function approximators
multi-armed bandit
online learning
linear regression
pairwise
state space
model-free