Publication: Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning.