On the Asymptotic Equivalence Between Differential Hebbian and Temporal Difference Learning.
Christoph Kolodziejski, Bernd Porr, Florentin Wörgötter
Published in: Neural Comput. (2009)
Keyphrases
- temporal difference learning
- loss bounds
- fixed point
- function approximation
- reinforcement learning
- evaluation function
- game playing
- approximate value iteration
- temporal difference
- reinforcement learning algorithms
- markov decision process
- markov decision processes
- policy iteration
- function approximators
- worst case
- optimal policy
- machine learning
- dynamic environments