Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic.

Published in: CoRR (2016)

Keyphrases