Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning.

Pratik Ramprasad, Yuantong Li, Zhuoran Yang, Zhaoran Wang, Will Wei Sun, Guang Cheng

Published in: CoRR (2021)

Keyphrases

policy evaluation
reinforcement learning
temporal difference
least squares
model-free
function approximation
policy iteration
Monte Carlo
Markov decision processes
statistical inference
TD learning
optimal policy
state space
variance reduction
dynamic programming
reinforcement learning algorithms
semiparametric
evaluation function
Bayesian networks
Bayesian inference
multi-agent
decision problems
step size
cross validation
partially observable
regression model
Markov decision problems
upper bound