Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines.
Cathy Wu
Aravind Rajeswaran
Yan Duan
Vikash Kumar
Alexandre M. Bayen
Sham M. Kakade
Igor Mordatch
Pieter Abbeel
Published in:
CoRR (2018)
Keyphrases
variance reduction
policy gradient
Monte Carlo
actor critic
sample size
importance sampling
reinforcement learning
confidence intervals
naive Bayes classifier
gradient method
state action
machine learning
upper bound