Read, Watch, Listen, and Summarize: Multi-Modal Summarization for Asynchronous Text, Image, Audio and Video.
Haoran Li, Junnan Zhu, Cong Ma, Jiajun Zhang, Chengqing Zong
Published in: IEEE Trans. Knowl. Data Eng. (2019)
Keyphrases
- multi modal
- video search
- multiple modalities
- cross modal
- audio visual
- single modality
- visual data
- image data
- multimedia
- auto annotation
- image analysis
- multi modality
- semantic concepts
- input image
- image classification
- uni modal
- image features
- text graphics
- video files
- image retrieval
- image content
- web images
- image annotation
- image collections
- fusing multiple
- sports video
- image representation
- broadcast news
- similarity measure
- high dimensional
- low level
- video data
- segmentation method
- visual cues
- soccer video
- semantic information
- audio content
- image regions
- video frames
- high resolution
- video content