On the asymptotic equivalence between differential Hebbian and temporal difference learning using a local third factor.
Christoph Kolodziejski, Bernd Porr, Minija Tamosiunaite, Florentin Wörgötter
Published in: NIPS (2008)
Keyphrases
- temporal difference learning
- function approximation
- fixed point
- loss bounds
- evaluation function
- reinforcement learning
- approximate value iteration
- temporal difference
- game playing
- Markov decision process
- reinforcement learning algorithms
- model free
- function approximators
- sufficient conditions
- Monte Carlo
- dynamic environments
- worst case
- pairwise