Improving Visual Speech Enhancement Network by Learning Audio-visual Affinity with Multi-head Attention.

Xinmeng Xu, Yang Wang, Jie Jia, Binbin Chen, Dejun Li
Published in: INTERSPEECH (2022)
Keyphrases
  • audio visual
  • multi modal
  • visual information
  • prior knowledge
  • image processing
  • visual data
  • computer vision
  • visual features
  • multi stream
  • speech enhancement