Logarithmic Regret for Reinforcement Learning with Linear Function Approximation.
Jiafan He, Dongruo Zhou, Quanquan Gu
Published in: CoRR (2020)
Keyphrases
- function approximation
- reinforcement learning
- function approximators
- temporal difference learning algorithms
- regret bounds
- temporal difference learning
- mountain car
- temporal difference
- state action space
- tile coding
- radial basis function
- learning tasks
- model free
- reinforcement learning algorithms
- learning algorithm
- machine learning
- state space
- temporal difference methods
- neural network
- td learning
- feature selection
- reward function
- learning problems
- markov decision processes
- optimal policy
- learning process
- multi agent