Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation.
Shaofei Huang, Han Li, Yuqing Wang, Hongji Zhu, Jiao Dai, Jizhong Han, Wenge Rong, Si Liu
Published in: IJCAI (2023)
Keyphrases
- audio visual
- visual data
- multi modal
- temporal segmentation
- visual information
- video scene
- emotion recognition
- multimodal fusion
- multi stream
- database
- audio visual speech recognition
- query processing
- audio features
- multimedia
- multiscale
- audio visual content
- high dimensional
- spatial relationships
- multimedia data
- image regions
- high dimensional data
- visual features
- data sources
- image sequences