Provable benefits of general coverage conditions in efficient online RL with function approximation.
Fanghui Liu, Luca Viano, Volkan Cevher
Published in: CoRR (2023)
Keyphrases
- function approximation
- reinforcement learning
- tile coding
- temporal difference learning
- temporal difference
- model free
- learning tasks
- radial basis function
- temporal difference learning algorithms
- reinforcement learning algorithms
- exploration exploitation tradeoff
- neural network
- function approximators
- td learning
- learning environment
- state space
- dynamic programming
- artificial neural networks