Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation.
Yan-Bo Lin, Yu-Chiang Frank Wang
Published in: AAAI (2021)
Keyphrases
- audio visual
- multi modal
- visual information
- audio features
- visual data
- multi stream
- audio visual speech recognition
- multimedia
- emotion recognition
- spatial data
- person authentication
- spatial information
- multimodal fusion
- spatial and temporal
- speaker verification
- audio visual content
- spatio temporal
- visual features
- low level
- high dimensional