Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic.
Shixiang Gu
Timothy P. Lillicrap
Zoubin Ghahramani
Richard E. Turner
Sergey Levine
Published in:
ICLR (2017)
Keyphrases
policy gradient
actor critic
gradient method
neural network