Approximate Policy Iteration with a Policy Language Bias.
Alan Fern, Sung Wook Yoon, Robert Givan
Published in: NIPS (2003)
Keyphrases
- approximate policy iteration
- policy iteration
- policy search
- reinforcement learning
- Markov decision problems
- Markov games
- Markov decision process
- Markov decision processes
- temporal difference
- optimal policy
- infinite horizon
- reinforcement learning algorithms
- machine learning
- fixed point
- optimal control
- least squares
- state space