Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic.

Published in: ICLR (2017)

Keyphrases