VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset.

Sihan Chen, Xingjian He, Longteng Guo, Xinxin Zhu, Weining Wang, Jinhui Tang, Jing Liu
Published in: CoRR (2023)