The Collusion of Memory and Nonlinearity in Stochastic Approximation With Constant Stepsize.
Dongyan Huo, Yixuan Zhang, Yudong Chen, Qiaomin Xie
Published in: CoRR (2024)
Keyphrases
- stochastic approximation
- step size
- Monte Carlo
- approximate dynamic programming
- convergence rate
- policy iteration
- temporal difference
- cost function
- search direction
- convergence speed
- Markov decision processes
- faster convergence
- reinforcement learning
- temporal difference learning
- quasi-Newton
- particle swarm optimization
- theoretical guarantees
- supervised learning
- objective function