On-Line Policy Gradient Estimation with Multi-Step Sampling.
Yan-Jie Li
Fang Cao
Xi-Ren Cao
Published in:
Discret. Event Dyn. Syst. (2010)
Keyphrases
multi step
gradient estimation
variance reduction
sample size
policy gradient
monte carlo
knn
optimal policy
k nearest neighbor
actor critic
importance sampling
reinforcement learning
nearest neighbor
optimal control
long run