Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing.
Yapeng Tian, Dingzeyu Li, Chenliang Xu
Published in: ECCV (3) (2020)
Keyphrases
- audio-visual
- weakly-supervised
- visual data
- multimedia
- multi modal
- object class
- visual information
- topic models
- video data
- relation extraction
- superpixels
- video sequences
- named entities
- multimedia data
- natural language processing
- video frames
- semi supervised
- natural language
- object detection
- object detectors
- key frames
- high dimensional data
- data sets
- visual features
- feature vectors
- image retrieval
- image sequences
- metadata