Improved and Generalized Upper Bounds on the Complexity of Policy Iteration.

Published in: NIPS (2013)

Keyphrases

upper bound
policy iteration
worst case
Markov decision processes
lower bound
model free
sample path
computational complexity
least squares
reinforcement learning
temporal difference
fixed point
policy evaluation
Markov decision process
finite state
decision making
special case
multi agent
function approximation
optimal policy
artificial neural networks
supervised learning
active learning
dynamic programming