Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation.
Yan-Bo Lin, Yu-Chiang Frank Wang
Published in: CoRR (2021)
Keyphrases
- audio visual
- multi modal
- visual information
- visual data
- audio features
- emotion recognition
- audio visual speech recognition
- multi stream
- spatial data
- speaker verification
- spatio temporal
- person authentication
- spatial information
- spatial relations
- multimodal fusion
- search engine
- multimedia
- spatial relationships
- spatial and temporal
- low level
- high dimensional
- video sequences