Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing.
Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, Limin Wang
Published in: ECCV (34) (2022)
Keyphrases
- audio visual
- weakly supervised
- denoising
- visual data
- multimedia
- multi modal
- visual information
- object class
- video sequences
- superpixels
- topic models
- video data
- relation extraction
- object detectors
- natural images
- named entities
- semi supervised
- video frames
- natural language processing
- image data
- object recognition
- multi label
- contextual information
- input image
- data sets
- multimedia data
- human motion
- text mining
- high dimensional