Sharp high-probability sample complexities for policy evaluation with linear function approximation.
Gen Li, Weichen Wu, Yuejie Chi, Cong Ma, Alessandro Rinaldo, Yuting Wei
Published in: CoRR (2023)
Keyphrases
- function approximation
- policy evaluation
- temporal difference
- reinforcement learning
- model-free
- function approximators
- TD learning
- learning tasks
- radial basis function
- least squares
- semi-parametric
- policy iteration
- sample size
- linear model
- Markov decision processes
- reinforcement learning algorithms
- Monte Carlo
- state space
- cost function
- active learning
- multi-agent
- neural network