LAVSS: Location-Guided Audio-Visual Spatial Audio Separation.
Yuxin Ye, Wenming Yang, Yapeng Tian
Published in: CoRR (2023)
Keyphrases
- audio visual
- multi modal
- visual information
- multi stream
- sound source
- audio visual speech recognition
- spatio temporal
- spatial information
- audio features
- emotion recognition
- visual data
- spatial and temporal
- spatial data
- multimodal fusion
- multimedia
- person authentication
- audio visual content
- high dimensional
- video frames
- space time
- low level
- keywords