Variance-Aware Off-Policy Evaluation with Linear Function Approximation.
Yifei Min, Tianhao Wang, Dongruo Zhou, Quanquan Gu
Published in: NeurIPS (2021)
Keyphrases
- function approximation
- policy evaluation
- temporal difference
- variance reduction
- reinforcement learning
- model-free
- TD learning
- function approximators
- Monte Carlo
- least squares
- reinforcement learning algorithms
- policy iteration
- learning tasks
- radial basis function
- semi parametric
- sample size
- Markov decision processes
- density estimation
- importance sampling
- learning process
- feature extraction
- feature selection