Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning.

Md. Masudur Rahman, Yexiang Xue

Published in: CoRR (2023)

Keyphrases

policy gradient
reinforcement learning
actor critic
function approximation
reinforcement learning algorithms
optimal control
policy search
policy gradient methods
gradient method
model free reinforcement learning
function approximators
variance reduction
state action
state space
neural network
single agent
control problems
temporal difference
model free
machine learning