Performance bounds for λ policy iteration and application to the game of Tetris.
Bruno Scherrer
Published in:
J. Mach. Learn. Res. (2013)
Keyphrases
policy iteration
Markov decision processes
learning algorithm
lower bound
decision making
reinforcement learning
least squares
optimal policy
Monte Carlo
optimal control