A unified view of TD algorithms, introducing Full-gradient TD and Equi-gradient descent TD.
Manuel Loth, Philippe Preux, Manuel Davy
Published in: ESANN (2007)
Keyphrases
- temporal difference
- learning algorithm
- td learning
- reinforcement learning
- policy evaluation
- computationally efficient
- reinforcement learning algorithms
- temporal difference learning
- orders of magnitude
- times faster
- function approximation
- optimization problems
- theoretical analysis
- computational complexity
- cost function
- computational cost
- evaluation function
- iterative methods
- data structure