Audio-visual phoneme classification for pronunciation training applications.
Hedvig Kjellström, Olov Engwall, Sherif Mahdy Abdou, Olle Bälter
Published in: INTERSPEECH (2007)
Keyphrases
- audio visual
- speech recognition
- multi modal
- training set
- digit recognition
- pattern recognition
- multi stream
- visual information
- feature extraction
- multimedia
- data sets
- feature vectors
- emotion recognition
- metadata
- context aware
- machine learning
- audio visual speech recognition
- text classification
- co occurrence
- feature space
- decision trees
- feature selection