Safe Policy Improvement with an Estimated Baseline Policy.
Thiago D. Simão
Romain Laroche
Rémi Tachet des Combes
Published in:
CoRR (2019)
Keyphrases
optimal policy
asymptotically optimal
databases
data mining
data sets
learning algorithm
data structure
significant improvement
motion estimation
infinite horizon