Off-Policy Evaluation via the Regularized Lagrangian.
Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans
Published in: CoRR (2020)
Keyphrases
- policy evaluation
- least squares
- reinforcement learning
- matrix inversion
- Monte Carlo
- model free
- temporal difference
- Markov decision processes
- policy iteration
- variance reduction
- function approximation
- optimal solution
- semi parametric
- linear regression
- partially observable Markov decision processes
- linear model
- optical flow
- statistical inference
- fixed point