Constant Stepsize Q-learning: Distributional Convergence, Bias and Extrapolation.
Yixuan Zhang, Qiaomin Xie
Published in: CoRR (2024)
Keyphrases
- step size
- convergence rate
- convergence speed
- learning rate
- faster convergence
- quasi newton
- stochastic approximation
- reinforcement learning
- temporal difference
- approximate dynamic programming
- cost function
- policy iteration
- cooperative
- learning algorithm
- multi agent
- search direction
- function approximation
- state space
- global convergence
- stochastic shortest path
- particle swarm optimization
- mutation operator
- global optimum
- model free
- genetic algorithm
- convergence analysis
- action selection
- primal dual
- pso algorithm
- differential evolution
- wavelet transform
- learning tasks
- simulated annealing
- estimation error
- objective function
- image processing
- optimal policy