Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation.
Yaqi Duan, Zeyu Jia, Mengdi Wang
Published in: ICML (2020)
Keyphrases
- function approximation
- policy evaluation
- temporal difference
- reinforcement learning
- td learning
- function approximators
- model free
- least squares
- radial basis function
- learning tasks
- semi parametric
- monte carlo
- policy iteration
- markov decision processes
- neural network
- dynamic programming
- evaluation function
- optimal solution
- cost function
- artificial neural networks
- average reward
- policy gradient
- e learning
- learning algorithm