Concentration of Contractive Stochastic Approximation and Reinforcement Learning.

Siddharth Chandak, Vivek S. Borkar

Published in: CoRR (2021)

Keyphrases

stochastic approximation
reinforcement learning
Monte Carlo
temporal difference learning
fixed point
policy iteration
function approximation
optimal solution
state space
optimal policy
reinforcement learning algorithms