Labeling audio-visual speech corpora and training an ANN/HMM audio-visual speech recognition system.
Martin Heckmann, Frédéric Berthommier, Christophe Savario, Kristian Kroschel
Published in: INTERSPEECH (2000)
Keyphrases
- audio visual
- multi stream
- multi modal
- artificial neural networks
- audio visual speech recognition
- visual information
- emotion recognition
- hidden markov models
- speaker verification
- person authentication
- multimedia
- natural language processing
- neural network
- training set
- audio features
- visual data
- low level
- speech recognition
- sound source
- image classification
- speech signal
- metadata