Publication: Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy.