Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization.
Hao Jiang, Calvin Murdock, Vamsi Krishna Ithapu
Published in: CVPR (2022)
Keyphrases
- audio visual
- multi channel
- video summarization
- multi modal
- visual information
- single channel
- speaker verification
- audio visual speech recognition
- emotion recognition
- visual data
- multi stream
- multimedia
- audio features
- person authentication
- search engine
- visual features
- sensor networks
- low level
- sound source
- spatio temporal
- keywords