Pathologies of temporal difference methods in approximate dynamic programming.
Dimitri P. Bertsekas
Published in: CDC (2010)
Keyphrases
- approximate dynamic programming
- temporal difference methods
- reinforcement learning
- temporal difference
- function approximation
- step size
- linear program
- policy search
- policy iteration
- dynamic programming
- function approximators
- evolutionary methods
- control policy
- td learning
- model free
- linear programming
- reinforcement learning algorithms
- state space
- action selection
- wavelet coefficients
- cost function
- computational complexity
- convergence speed
- neural network
- average cost
- evolutionary algorithm
- evaluation function
- markov decision processes
- multistage
- optimal policy
- support vector machine (svm)
- wavelet transform
- optical flow