Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction.

Zhaoxi Mu, Xinyu Yang
Published in: CoRR (2024)
Keyphrases
  • audio-visual
  • multi-modal
  • cross-modal
  • visual data
  • multi-stream
  • visual information
  • multimedia
  • audio features
  • information retrieval
  • high-dimensional
  • spatio-temporal