Agnostic $Q$-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity.
Simon S. Du, Jason D. Lee, Gaurav Mahajan, Ruosong Wang
Published in: NeurIPS (2020)
Keyphrases
- function approximation
- approximation error
- sample complexity
- reinforcement learning
- temporal difference learning algorithms
- model free
- upper bound
- temporal difference learning
- lower bound
- radial basis function
- learning problems
- temporal difference
- supervised learning
- function approximators
- learning algorithm
- worst case
- high dimensional
- data sets
- sample size
- theoretical analysis
- generalization error
- dynamic programming
- prior knowledge
- training data
- image segmentation
- temporal difference methods