A New View of Multi-modal Language Analysis: Audio and Video Features as Text "Styles".
Zhongkai Sun, Prathusha Kameswara Sarma, Yingyu Liang, William A. Sethares
Published in: EACL (2021)
Keyphrases
- multi modal
- multiple modalities
- video search
- audio visual
- cross modal
- multimedia
- multi modality
- feature extraction
- semantic concepts
- video analysis
- co occurrence
- audio content
- video data
- text to speech
- video streams
- broadcast news
- video clips
- feature vectors
- feature space
- keywords
- fusing multiple
- uni modal
- video segments
- auto annotation
- video database
- multimedia databases
- video retrieval
- image annotation
- text retrieval
- text documents
- medical images
- relevance feedback
- image features
- low level
- high dimensional