Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle.
Simon S. Du, Yuping Luo, Ruosong Wang, Hanrui Zhang
Published in: CoRR (2019)
Keyphrases
- function approximation
- reinforcement learning
- temporal difference learning
- temporal difference learning algorithms
- tile coding
- model free
- radial basis function
- learning tasks
- mountain car
- state action space
- temporal difference
- reinforcement learning algorithms
- function approximators
- td learning
- support vector
- learning experience
- supervised learning
- machine learning