A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism.
Ilya Gurvich, Ido Leichter, Dharmendar Reddy Palle, Yossi Asher, Alon Vinnikov, Igor Abramovski, Vishak Gopal, Ross Cutler, Eyal Krupka
Published in: ICASSP (2024)
Keyphrases
- audio visual
- multi modal
- visual information
- speaker verification
- multi stream
- visual data
- multimedia
- emotion recognition
- spatio temporal
- databases
- audio features
- image database
- spatial data
- sound source
- audio visual speech recognition
- spatial and temporal
- spatial information
- visual content
- noisy environments
- spatial relationships
- query processing
- high level
- data sets