An Off-policy Policy Gradient Theorem Using Emphatic Weightings.
Ehsan Imani
Eric Graves
Martha White
Published in:
NeurIPS (2018)
Keyphrases
policy gradient
reinforcement learning
actor critic
parametric optimization
function approximation
gradient method
optimal control
model free reinforcement learning
approximation methods
variance reduction
convergence rate