Audio-Visual Child-Adult Speaker Classification in Dyadic Interactions.
Anfeng Xu, Kevin Huang, Tiantian Feng, Helen Tager-Flusberg, Shrikanth Narayanan
Published in: ICASSP (2024)
Keyphrases
- audio visual
- multi modal
- temporal context
- speaker verification
- visual information
- visual data
- audio visual speech recognition
- multi stream
- pattern recognition
- feature extraction
- multimedia
- machine learning
- image classification
- emotion recognition
- text classification
- feature space
- audio features
- person authentication
- feature vectors
- training set
- feature selection