Low Variance Off-policy Evaluation with State-based Importance Sampling.
David M. Bossens
Philip Thomas
Published in: CoRR (2022)
Keyphrases
importance sampling
monte carlo
policy evaluation
variance reduction
low variance
markov chain
kalman filter
least squares
markov chain monte carlo
temporal difference
learning algorithm
sample size
markov decision processes