On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method.
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
Published in:
NeurIPS (2021)
Keyphrases
gradient method
convergence rate
policy gradient
actor critic
step size
negative matrix factorization
sample size
convergence speed
optimal policy
optimization methods
convex formulation
data sets
log likelihood function