Enhancing Policy Gradient with the Polyak Step-Size Adaption.
Yunxiang Li, Rui Yuan, Chen Fan, Mark Schmidt, Samuel Horváth, Robert M. Gower, Martin Takác
Published in: CoRR (2024)
Keyphrases
- step size
- policy gradient
- gradient method
- convergence rate
- cost function
- convergence speed
- reinforcement learning
- faster convergence
- function approximation
- temporal difference
- reinforcement learning algorithms
- learning rate
- wavelet coefficients
- optimization methods
- approximation methods
- optimal control
- variance reduction
- feature vectors