Benchmarks for Deep Off-Policy Evaluation.
Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
Published in: CoRR (2021)
Keyphrases
- policy evaluation
- least squares
- Monte Carlo
- reinforcement learning
- temporal difference
- model-free
- matrix inversion
- variance reduction
- policy iteration
- function approximation
- Markov decision processes
- semiparametric
- statistical inference
- neural network
- Markov chain
- fixed point
- graphical models
- state space
- reinforcement learning algorithms
- importance sampling
- training set