Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models.

Shimin Chen, Yitian Yuan, Shaoxiang Chen, Zequn Jie, Lin Ma
Published in: CoRR (2024)
Keyphrases