Audio-visual child-adult speaker classification in dyadic interactions.
Anfeng Xu, Kevin Huang, Tiantian Feng, Helen Tager-Flusberg, Shrikanth Narayanan
Published in: CoRR (2023)
Keyphrases
- audio visual
- multi modal
- visual information
- pattern recognition
- speaker verification
- visual data
- multi stream
- machine learning
- multimedia
- feature extraction
- temporal context
- emotion recognition
- audio visual speech recognition
- person authentication
- text classification
- image classification
- feature vectors
- feature selection
- visual features
- training set
- feature space