A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
Manuel Loth, Philippe Preux
Published in: CoRR (2006)
Keyphrases
- TD learning
- temporal difference
- learning algorithm
- reinforcement learning
- evaluation function
- temporal difference learning
- policy evaluation
- reinforcement learning algorithms
- computational cost
- theoretical analysis
- orders of magnitude
- computational complexity
- data structure
- function approximation
- multiscale
- policy iteration
- eligibility traces
- neural network
- benchmark datasets
- computationally efficient
- edge detection
- feature selection