Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators.
Zaiwei Chen
Siva Theja Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
Published in:
NeurIPS (2021)
Keyphrases
finite sample
neural network
support vector machine (SVM)
sample size
temporal difference