Integration of audio-visual information for multi-speaker multimedia speaker recognition.
Jichen Yang, Fangfan Chen, Yu Cheng, Pei Lin
Published in: Digit. Signal Process. (2024)
Keyphrases
- visual information
- speaker recognition
- audio visual
- speaker verification
- multimedia
- visual features
- visual data
- speaker identification
- low level
- emotion recognition
- gaussian mixture model
- vector quantization
- visual content
- audio features
- probabilistic neural network
- eye movements
- metadata
- speech recognition
- noisy environments
- image collections
- acoustic features
- mel frequency cepstral coefficients
- semantic information
- machine learning