Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space.

Anas Barakat, Ilyas Fatkhullin, Niao He

Published in: CoRR (2023)

Keyphrases

reinforcement learning
variance reduction
state-action space
function approximation
model free
sample size
Monte Carlo
policy gradient
text categorization
Markov decision processes