Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video.
Haoran Li, Junnan Zhu, Cong Ma, Jiajun Zhang, Chengqing Zong
Published in: EMNLP (2017)
Keyphrases
- multi modal
- video search
- multiple modalities
- single modality
- audio visual
- cross modal
- visual data
- auto annotation
- semantic concepts
- image analysis
- multimedia
- uni modal
- video files
- multi modality
- image features
- image representation
- text graphics
- image retrieval
- input image
- image classification
- image annotation
- fusing multiple
- image content
- video data
- image collections
- low level
- high dimensional
- broadcast news
- key frames
- visual information
- visual concepts
- video sequences
- keywords
- image processing
- soccer video
- sports video
- imaging modalities
- video content
- video streams
- video frames
- image regions
- medical images