Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory.
Ruiqi Zhang
Xuezhou Zhang
Chengzhuo Ni
Mengdi Wang
Published in:
ICML (2022)
Keyphrases
function approximators
function approximation
objective function
reinforcement learning
dynamic programming
Bayesian networks
learning experience
Monte Carlo
dynamic Bayesian networks