Neural Temporal-Difference Learning Converges to Global Optima.
Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang
Published in: CoRR (2019)
Keyphrases
- temporal difference learning
- global optima
- global optimization
- optimization problems
- function approximation
- global optimum
- optimization algorithm
- fixed point
- game playing
- reinforcement learning
- evaluation function
- optimal solution
- control parameters
- neural network
- temporal difference
- global search
- reinforcement learning algorithms
- estimation of distribution algorithms
- evolutionary algorithm
- markov decision process
- function optimization
- premature convergence
- search space
- optimization method
- training set
- graphical models
- search algorithm
- objective function
- dynamic programming