A Unified Off-Policy Evaluation Approach for General Value Function.
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
Published in:
CoRR (2021)
Keyphrases
policy evaluation
least squares
special case
cost function
sufficient conditions
model free
temporal difference
learning algorithm
reinforcement learning
support vector
Monte Carlo
Markov decision processes