Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize.
Huizhen Yu
Published in: J. Mach. Learn. Res. (2016)
Keyphrases
- temporal difference learning
- temporal difference
- step size
- convergence rate
- faster convergence
- function approximation
- reinforcement learning
- quasi newton
- fixed point
- convergence speed
- evaluation function
- monte carlo
- game playing
- reinforcement learning algorithms
- artificial neural networks
- model free
- multiscale
- multi objective
- markov decision process
- policy iteration
- function approximators