The Smoothed Complexity of Policy Iteration for Markov Decision Processes.
Miranda Christ, Mihalis Yannakakis
Published in: STOC (2023)
Keyphrases
- markov decision processes
- policy iteration
- optimal policy
- sample path
- state space
- finite state
- dynamic programming
- approximate dynamic programming
- model free
- average reward
- factored mdps
- reinforcement learning
- markov decision process
- transition matrices
- infinite horizon
- fixed point
- decision problems
- planning under uncertainty
- reinforcement learning algorithms
- markov decision problems
- policy evaluation
- least squares
- state and action spaces
- markov games
- decision processes
- stochastic games
- average cost
- linear programming
- discounted reward
- partially observable
- finite horizon
- temporal difference