Recurrent Natural Policy Gradient for POMDPs.

Semih Cayci, Atilla Eryilmaz

Published in: CoRR (2024)

Keyphrases

policy gradient
gradient ascent
actor critic
function approximation
reinforcement learning
gradient method
policy gradient methods
policy search
partially observable Markov decision processes
reinforcement learning algorithms
variance reduction
optimal control
approximation methods
reinforcement learning methods
single agent
model free reinforcement learning
average reward
state action
learning algorithm
model free
optimal policy
dynamic programming
multi agent systems