Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments.
Vincent Liu
Yash Chandak
Philip S. Thomas
Martha White
Published in:
AISTATS (2023)
Keyphrases
nonstationary
training data
reinforcement learning
cost function
machine learning
least squares
statistical inference