Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.
Jianwei Yu, Shi-Xiong Zhang, Bo Wu, Shansong Liu, Shoukang Hu, Mengzhe Geng, Xunying Liu, Helen Meng, Dong Yu
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2021)
Keyphrases
- audio visual
- multi channel
- digit recognition
- multi modal
- visual information
- emotion recognition
- multi stream
- single channel
- visual data
- multimedia
- feature extraction
- object recognition
- speaker verification
- audio visual speech recognition
- action recognition
- human activities
- speech recognition
- visual features
- image data
- image sequences