Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction.
Zhaoxi Mu
Xinyu Yang
Published in:
CoRR (2024)
Keyphrases
audio visual
multi modal
cross modal
visual data
multi stream
visual information
multimedia
audio features
information retrieval
high dimensional
spatio temporal