Low Variance Off-policy Evaluation with State-based Importance Sampling.
David M. Bossens
Philip Thomas
Published in:
CoRR (2022)
Keyphrases
importance sampling
Monte Carlo
policy evaluation
variance reduction
low variance
Markov chain
Kalman filter
least squares
Markov chain Monte Carlo
temporal difference
learning algorithm
sample size
Markov decision processes