A large-scale multimodal dataset of human speech recognition.
Yao Ge, Chong Tang, Haobo Li, Zikang Zhang, Wenda Li, Kevin Chetty, Daniele Faccio, Qammer H. Abbasi, Muhammad Imran
Published in: CoRR (2023)
Keyphrases
- speech recognition
- speech processing
- hidden Markov models
- speech synthesis
- speech recognizer
- pattern recognition
- speech recognition systems
- language model
- automatic speech recognition
- speech understanding
- speech signal
- noisy environments
- speech recognition technology
- multimodal
- speaker identification
- speech retrieval
- speech recognition errors
- speaker independent
- keyword spotting
- speech recognizers
- isolated word
- speaker dependent
- speaker adaptation
- probabilistic model
- multimedia
- image processing