Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization.

Yang Jin, Zhicheng Sun, Kun Xu, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu
Published in: CoRR (2024)