Publication: Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video.