Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition.
Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu
Published in: CoRR (2023)
Keyphrases
- audio visual
- end to end
- multi channel
- multi modal
- sound source
- single channel
- visual information
- multi stream
- wireless ad hoc networks
- emotion recognition
- visual data
- multimedia
- multipath
- pattern recognition
- text localization and recognition
- feature extraction
- audio visual speech recognition
- ad hoc networks
- activity recognition
- congestion control
- audio features
- mac protocol
- action recognition
- low level