Convergence Properties of Policy Iteration.
Manuel S. Santos, John Rust
Published in: SIAM J. Control. Optim. (2004)
Keyphrases
- policy iteration
- Markov decision processes
- stochastic approximation
- convergence rate
- model free
- fixed point
- sample path
- policy evaluation
- finite state
- optimal policy
- reinforcement learning
- least squares
- probability distribution
- image sequences
- state space
- temporal difference
- Markov decision process
- pairwise
- machine learning