Combining importance sampling and temporal difference control variates to simulate Markov Chains.

R. S. Randhawa, Sandeep Juneja

Published in: ACM Trans. Model. Comput. Simul. (2004)

Keyphrases

Markov chain
importance sampling
Monte Carlo
temporal difference
steady state
finite state
transition probabilities
reinforcement learning
state space
function approximation
evaluation function
Markov processes
policy iteration
Markov chain Monte Carlo
variance reduction
action selection
confidence intervals
step size
model free
Kalman filter
dynamic programming
machine learning