Sample-based Distributional Policy Gradient

Rahul Singh, Keuntaek Lee, Yongxin Chen

Published in: L4DC (2022)

Keyphrases

policy gradient
reinforcement learning
actor critic
parametric optimization
optimal control
sample size
function approximation
gradient method
variance reduction
model free reinforcement learning
reinforcement learning algorithms