Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation.
Yaqi Duan, Mengdi Wang
Published in: CoRR (2020)
Keyphrases
- function approximation
- policy evaluation
- temporal difference
- reinforcement learning
- td learning
- model free
- function approximators
- evaluation function
- radial basis function
- dynamic programming
- least squares
- learning tasks
- markov decision processes
- monte carlo
- linear model
- reinforcement learning algorithms
- optimal solution
- finite state
- feature extraction
- variance reduction
- semi parametric
- policy iteration
- control strategy
- markov chain