Benchmarks for Deep Off-Policy Evaluation.
Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Thomas Paine
Published in: ICLR (2021)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- reinforcement learning
- Monte Carlo
- model free
- policy iteration
- Markov decision processes
- function approximation
- variance reduction
- matrix inversion
- semi parametric
- statistical inference
- action selection
- reinforcement learning algorithms
- optimal policy
- decision making
- neural network
- Gaussian process
- fixed point