Policy Evaluation with Temporal Differences: A Survey and Comparison (Extended Abstract).
Christoph Dann, Gerhard Neumann, Jan Peters
Published in: ICAPS (2015)
Keyphrases
- extended abstract
- temporal difference
- policy evaluation
- TD learning
- reinforcement learning
- function approximation
- evaluation function
- least squares
- model-free
- Monte Carlo
- step size
- reinforcement learning algorithms
- policy iteration
- action selection
- variance reduction
- Markov decision processes
- supervised learning
- semi-parametric
- cost function
- data mining
- multiresolution
- learning algorithm