Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing.
Yapeng Tian, Dingzeyu Li, Chenliang Xu
Published in: CoRR (2020)
Keyphrases
- audio visual
- weakly supervised
- visual data
- multimedia
- multi modal
- visual information
- relation extraction
- object class
- topic models
- superpixels
- video sequences
- video data
- video frames
- multimedia data
- visual features
- semi supervised
- named entities
- natural language
- key frames
- human actions
- bag of words
- contextual information
- natural language processing
- object detection
- image data