Parameter-free Gradient Temporal Difference Learning.
Andrew Jacobsen, Alan Chan
Published in: CoRR (2021)
Keyphrases
- parameter free
- temporal difference learning
- function approximation
- fixed point
- evaluation function
- reinforcement learning
- categorical data
- game playing
- temporal difference
- outlier detection
- markov decision process
- fully automatic
- reinforcement learning algorithms
- feature space
- function approximators
- unsupervised learning
- monte carlo
- dynamic programming