LiMuSE: Lightweight Multi-modal Speaker Extraction.
Qinghua Liu, Yating Huang, Yunzhe Hao, Jiaming Xu, Bo Xu
Published in: CoRR (2021)
Keyphrases
- multi modal
- lightweight
- audio visual
- multi modality
- speaker verification
- communication infrastructure
- semantic concepts
- uni modal
- development environments
- image annotation
- high dimensional
- humanoid robot
- RFID tags
- video search
- speech recognition
- low cost
- multiple modalities
- wireless sensor networks
- single modality
- image segmentation