Correlating Subword Articulation with Lip Shapes for Embedding Aware Audio-Visual Speech Enhancement.
Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Bao-Cai Yin, Chin-Hui Lee
Published in: CoRR (2020)
Keyphrases
- audio visual
- speech enhancement
- audio visual speech recognition
- noisy environments
- speech recognition
- multi modal
- noise reduction
- speech signal
- signal to noise ratio
- multi stream
- single channel
- visual information
- multimedia
- linear prediction
- vocal tract
- visual data
- sound source
- speaker identification
- smoothing algorithm
- Wiener filter
- visual speech
- multi channel
- hidden Markov models
- automatic speech recognition
- broadcast news
- frequency domain
- data analysis
- information retrieval