Accelerated and instance-optimal policy evaluation with linear function approximation.
Tianjiao Li, Guanghui Lan, Ashwin Pananjady
Published in: CoRR (2021)
Keyphrases
- function approximation
- policy evaluation
- temporal difference
- reinforcement learning
- function approximators
- model-free
- TD learning
- least squares
- radial basis function
- learning tasks
- semi-parametric
- Monte Carlo
- dynamic programming
- Markov decision processes
- optimal control
- state space
- policy iteration
- artificial neural networks
- optimal solution
- learning algorithm
- data mining
- Markov chain
- step size