Identifying Policy Gradient Subspaces.

Jan Schneider, Pierre Schumacher, Simon Guist, Le Chen, Daniel F. B. Häufle, Bernhard Schölkopf, Dieter Büchler

Published in: CoRR (2024)

Keyphrases

policy gradient
actor critic
parametric optimization
high dimensional data
principal component analysis
reinforcement learning
neural network
high dimensional
function approximation
optimal control
average reward
gradient method