Deep Audio-Visual Beamforming for Speaker Localization.
Xinyuan Qian, Qiquan Zhang, Guohui Guan, Wei Xue
Published in: IEEE Signal Process. Lett. (2022)
Keyphrases
- audio visual
- sound source
- multi modal
- speaker verification
- visual information
- visual data
- temporal context
- multimedia
- multi stream
- emotion recognition
- audio visual speech recognition
- person authentication
- audio features
- data processing
- principal component analysis
- pattern recognition
- three dimensional
- computer vision
- data sets