Approximate modified policy iteration and its application to the game of Tetris.
Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner, Matthieu Geist
Published in: J. Mach. Learn. Res. (2015)
Keyphrases
- policy iteration
- policy evaluation
- Markov decision processes
- factored MDPs
- approximate policy iteration
- model free
- temporal difference
- Markov games
- reinforcement learning
- optimal policy
- fixed point
- minimax search
- least squares
- finite state
- evaluation function
- sample path
- infinite horizon
- Markov decision process
- game theory
- game playing
- stochastic games
- neural network
- Nash equilibrium
- linear programming
- Monte Carlo
- average reward
- machine learning
- state space
- Markov decision problems
- variance reduction
- convergence rate
- reinforcement learning algorithms
- semi parametric
- hybrid algorithms
- Nash equilibria
- function approximation
- search space
- optimal strategy