A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection.
Otavio Braga, Olivier Siohan
Published in: ICASSP (2021)
Keyphrases
- audio visual
- speech recognition
- audio visual speech recognition
- multi modal
- multi stream
- hidden markov models
- speaker verification
- speaker identification
- emotion recognition
- automatic speech recognition
- speech signal
- visual information
- speech synthesis
- language model
- speech recognizer
- pattern recognition
- digit recognition
- speech recognition systems
- speaker independent
- speaker dependent
- multimedia
- visual data
- audio features
- noisy environments
- speaker diarization
- data mining
- search engine
- speaker adaptation