Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning.
R. Srikant
Published in: CoRR (2024)
Keyphrases
- Markov chain
- central limit theorem
- rates of convergence
- temporal difference
- Monte Carlo
- steady state
- heavy traffic
- probability distribution
- average reward
- finite state
- reinforcement learning
- state space
- evaluation function
- transition probabilities
- function approximation
- expectation maximization
- policy iteration
- importance sampling
- model free
- supervised learning
- machine learning
- learning tasks
- single server
- dynamic programming
- search space