Policy Evaluation with Stochastic Gradient Estimation Techniques.
Yi Zhou
Michael C. Fu
Ilya O. Ryzhov
Published in:
WSC (2022)
Keyphrases
gradient estimation
variance reduction
policy evaluation
Monte Carlo
sample size
importance sampling
Markov chain
temporal difference
Markov decision processes
confidence intervals
reinforcement learning
least squares
machine learning
probability distribution
model free