On The Convergence Of Policy Iteration-Based Reinforcement Learning With Monte Carlo Policy Evaluation.
Anna Winnicki, R. Srikant
Published in: CoRR (2023)
Keyphrases
- policy evaluation
- monte carlo
- policy iteration
- stochastic approximation
- temporal difference
- reinforcement learning
- matrix inversion
- variance reduction
- importance sampling
- convergence rate
- markov chain
- markov decision processes
- model free
- reinforcement learning algorithms
- sample path
- optimal policy
- least squares
- decision making
- particle filter
- machine learning
- markov chain monte carlo
- confidence intervals
- convergence speed
- fixed point
- markov decision problems
- active learning