Off-Policy Evaluation via the Regularized Lagrangian.
Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans
Published in: NeurIPS (2020)
Keyphrases
- policy evaluation
- least squares
- reinforcement learning
- matrix inversion
- temporal difference
- Monte Carlo
- Markov decision processes
- policy iteration
- model free
- linear regression
- optimal solution
- variance reduction
- function approximation
- linear model
- semi parametric
- reinforcement learning algorithms
- feature selection
- Markov decision problems
- optical flow
- image sequences