Dual-path Attention is All You Need for Audio-Visual Speech Extraction.
Zhongweiyang Xu, Xulin Fan, Mark Hasegawa-Johnson
Published in: CoRR (2022)
Keyphrases
- visual speech
- hidden markov models
- visual speech recognition
- audio visual speech recognition
- speaker identification
- audio signals
- lip reading
- acoustic features
- noisy environments
- visual attention
- audio signal
- video signals
- multimedia
- speech signal
- broadcast news
- audio visual
- visual information
- keywords
- multi stream
- information retrieval systems
- pattern recognition