A Single Transformer for Scalable Vision-Language Modeling.

Yangyi Chen, Xingyao Wang, Hao Peng, Heng Ji
Published in: CoRR (2024)
Keyphrases