Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic.
Shixiang Gu
Timothy P. Lillicrap
Zoubin Ghahramani
Richard E. Turner
Sergey Levine
Published in:
CoRR (2016)
Keyphrases
policy gradient
actor critic
neural network
gradient method
model selection
mathematical model
function approximation
optimal control