Performance bounds for λ policy iteration and application to the game of Tetris.

Published in: J. Mach. Learn. Res. (2013)

Keyphrases

policy iteration
Markov decision processes
learning algorithm
lower bound
decision making
reinforcement learning
least squares
optimal policy
Monte Carlo
optimal control