Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds.
Andrea Zanette
Emma Brunskill
Published in:
ICML (2019)
Keyphrases
domain knowledge
regret bounds
reinforcement learning
upper bound
lower bound
function approximation
function approximators
optimal solution
state space
Markov decision processes
machine learning
learning process
linear regression
multi-armed bandit