Improved and Generalized Upper Bounds on the Complexity of Policy Iteration.
Bruno Scherrer
Published in: Math. Oper. Res. (2016)
Keyphrases
- upper bound
- policy iteration
- worst case
- lower bound
- Markov decision processes
- model free
- optimal policy
- fixed point
- reinforcement learning
- sample path
- finite state
- temporal difference
- least squares
- sample size
- policy evaluation
- infinite horizon
- search algorithm
- model checking
- NP-hard
- special case
- artificial neural networks
- training set
- Markov decision problems
- optimal solution