Towards Provable Log Density Policy Gradient.

Pulkit Katdare, Anant Joshi, Katherine Rose Driggs-Campbell

Published in: CoRR (2024)

Keyphrases

policy gradient
parametric optimization
function approximation
reinforcement learning
actor-critic
gradient method
optimal control
reinforcement learning algorithms
model-free reinforcement learning
neural network
approximation methods
variance reduction
control problems
function approximators
average reward