Safe Policy Improvement by Minimizing Robust Baseline Regret.
Mohammad Ghavamzadeh
Marek Petrik
Yinlam Chow
Published in: NIPS (2016)
Keyphrases
lower bound
online learning
optimal policy
error reduction
relative improvement
e-learning
image sequences
particle filter
computationally efficient
partial occlusion
robust estimation