Multi-Bellman operator for convergence of Q-learning with linear function approximation.
Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo
Published in: CoRR (2023)
Keyphrases
- function approximation
- temporal difference learning algorithms
- temporal difference learning
- function approximators
- reinforcement learning
- tile coding
- model free
- temporal difference
- learning tasks
- radial basis function
- state action space
- state action
- linear program
- actor critic
- convergence speed
- reinforcement learning algorithms
- learning styles
- support vector machine
- pattern recognition