Rollout sampling approximate policy iteration.
Christos Dimitrakakis, Michail G. Lagoudakis
Published in: Mach. Learn. (2008)
Keyphrases
- approximate policy iteration
- reinforcement learning
- policy iteration
- Markov decision problems
- policy search
- temporal difference
- Markov games
- Markov decision processes
- optimal policy
- state space
- Markov decision process
- multiagent reinforcement learning
- linear programming
- partially observable
- reinforcement learning algorithms
- fixed point
- function approximators
- function approximation
- control problems
- model free
- multi agent
- linear program
- average reward
- infinite horizon
- finite state
- expected utility
- machine learning
- evaluation function
- least squares
- decision theoretic
- partially observable Markov decision processes
- convergence rate
- utility function
- Monte Carlo
- optical flow
- search space
- decision making