Improved and Generalized Upper Bounds on the Complexity of Policy Iteration.
Bruno Scherrer
Published in: NIPS (2013)
Keyphrases
- upper bound
- policy iteration
- worst case
- Markov decision processes
- lower bound
- model free
- sample path
- computational complexity
- least squares
- reinforcement learning
- temporal difference
- fixed point
- policy evaluation
- Markov decision process
- finite state
- decision making
- special case
- multi agent
- function approximation
- optimal policy
- artificial neural networks
- supervised learning
- active learning
- dynamic programming