DNN Driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation.
Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain
Published in: INTERSPEECH (2018)
Keyphrases
- audio visual
- digit recognition
- speaker independent
- multi modal
- speech recognition
- visual information
- emotion recognition
- speaker dependent
- multi stream
- multimedia
- sound source
- hidden Markov models
- visual data
- speech recognizer
- speaker verification
- audio visual speech recognition
- language model
- speaker identification
- acoustic models
- automatic speech recognition
- audio features
- mixture model
- low level
- pattern recognition
- search engine
- information retrieval
- machine learning