An audio-visual corpus for multimodal automatic speech recognition.
Andrzej Czyzewski, Bozena Kostek, Piotr Bratoszewski, Jozef Kotus, Marcin S. Szczuka
Published in: J. Intell. Inf. Syst. (2017)
Keyphrases
- audio visual
- automatic speech recognition
- conversational speech
- spontaneous speech
- speech recognition
- multi modal
- hidden Markov models
- speech signal
- visual information
- broadcast news
- multi stream
- visual data
- multimodal fusion
- multimedia
- noisy environments
- acoustic features
- emotion recognition
- audio features
- passage retrieval
- speaker verification
- pattern recognition
- machine learning