Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens.

Fan Ma, Xiaojie Jin, Heng Wang, Yuchen Xian, Jiashi Feng, Yi Yang
Published in: CoRR (2023)
Keyphrases