Publication: Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events.