Improved and Generalized Upper Bounds on the Complexity of Policy Iteration.

Published in: Math. Oper. Res. (2016)

Keyphrases

upper bound
policy iteration
worst case
lower bound
Markov decision processes
model free
optimal policy
fixed point
reinforcement learning
sample path
finite state
temporal difference
least squares
sample size
policy evaluation
infinite horizon
search algorithm
model checking
NP-hard
special case
artificial neural networks
training set
Markov decision problems
optimal solution