Kernel-Based Least Squares Temporal Difference With Gradient Correction.
Tianheng Song, Dazi Li, Liulin Cao, Kotaro Hirasawa
Published in: IEEE Trans. Neural Networks Learn. Syst. (2016)
Keyphrases
- temporal difference
- least squares
- policy evaluation
- td learning
- reinforcement learning
- function approximation
- evaluation function
- policy iteration
- monte carlo
- temporal difference learning
- reinforcement learning algorithms
- action selection
- temporal difference methods
- model free
- support vector machine
- step size
- policy gradient
- kernel methods
- supervised learning
- optical flow
- gradient method
- predictive state representations
- function approximators
- variance reduction
- actor critic
- reinforcement learning problems