Approximate policy iteration using regularised Bellman residuals minimisation.
Gennaro Esposito, Mario Martín
Published in: J. Exp. Theor. Artif. Intell. (2016)
Keyphrases
- approximate policy iteration
- policy iteration
- least squares
- reinforcement learning
- Markov games
- Markov decision problems
- Markov decision processes
- linear program
- policy search
- temporal difference
- state action
- linear programming
- temporal difference learning
- multiagent reinforcement learning
- function approximators
- model free
- optimal policy
- function approximation
- Markov decision process
- reinforcement learning algorithms
- state space
- fixed point
- learning algorithm
- expected utility