On The Convergence Of Policy Iteration-Based Reinforcement Learning With Monte Carlo Policy Evaluation.
Anna Winnicki, R. Srikant
Published in: AISTATS (2023)
Keyphrases
- policy evaluation
- monte carlo
- stochastic approximation
- policy iteration
- temporal difference
- reinforcement learning
- matrix inversion
- variance reduction
- markov decision processes
- markov chain
- convergence rate
- least squares
- model free
- optimal policy
- importance sampling
- reinforcement learning algorithms
- fixed point
- finite state
- particle filter
- convergence speed
- machine learning