Publication: Neural Temporal Difference and Q Learning Provably Converge to Global Optima.