Publication: STRONG: Spatio-Temporal Reinforcement Learning for Cross-Modal Video Moment Localization.