Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning.
Sihan Zeng, Thinh T. Doan
Published in: CoRR (2024)
Keyphrases
- gradient method
- reinforcement learning
- actor critic
- policy gradient
- convergence rate
- step size
- convex formulation
- function approximation
- optimization methods
- negative matrix factorization
- markov decision processes
- temporal difference
- information retrieval systems
- state space
- natural gradient learning
- convergence speed
- data mining
- optimal control
- reinforcement learning algorithms
- optimal policy
- approximate dynamic programming