Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants.
Tianyu Yu, Jinyi Hu, Yuan Yao, Haoye Zhang, Yue Zhao, Chongyi Wang, Shan Wang, Yinxv Pan, Jiao Xue, Dahai Li, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun
Published in: CoRR (2023)