CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition.
Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Peng Xu, Cheuk Tung Shadow Yiu, Rita Frieske, Holy Lovenia, Genta Indra Winata, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung
Published in: CoRR (2022)
Keyphrases
- visual speech
- hidden Markov models
- visual speech recognition
- lip reading
- speaker identification
- audio visual speech recognition
- noisy environments
- audio signals
- cepstral coefficients
- video signals
- speech recognition
- acoustic features
- broadcast news
- text to speech
- Gaussian mixture model
- multimedia
- visual information
- pattern recognition