Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations.

Timothy A. Mann, Shie Mannor

Published in: ICML (2014)

Keyphrases

approximate value iteration
fixed point
temporal difference learning
function approximation
sufficient conditions
optimal policy
Markov decision problems
dynamical systems
reinforcement learning
game playing
Markov decision process
training set
linear programming
policy iteration