Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments.

Hengshun Zhou, Jun Du, Hang Chen, Zijun Jing, Shifu Xiong, Chin-Hui Lee
Published in: Interspeech (2021)
Keyphrases