VatLM: Visual-Audio-Text Pre-Training With Unified Masked Prediction for Speech Representation Learning.

Qiushi Zhu, Long Zhou, Ziqiang Zhang, Shujie Liu, Binxing Jiao, Jie Zhang, Li-Rong Dai, Daxin Jiang, Jinyu Li, Furu Wei
Published in: IEEE Trans. Multim. (2024)
Keyphrases