Jiabo Ye
Publication Activity (10 Years)
Years Active: 2021-2024
Publications (10 Years): 21
Top Topics
Cross Modal
Named Entity Recognition
Language Model
N Gram
Top Venues
CoRR
EMNLP
ICME
ICASSP
Publications
Junyang Wang, Haiyang Xu, Jiabo Ye, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception.
CoRR (2024)
Chenlin Zhao, Jiabo Ye, Yaguang Song, Ming Yan, Xiaoshan Yang, Changsheng Xu
Part-Aware Prompt Tuning for Weakly Supervised Referring Expression Grounding.
MMM (3) (2024)
Jianglin Jin, Jiabo Ye, Xin Lin, Liang He
Pseudo-Query Generation For Semi-Supervised Visual Grounding With Knowledge Distillation.
ICASSP (2023)
Haiyang Xu, Qinghao Ye, Ming Yan, Yaya Shi, Jiabo Ye, Yuanhong Xu, Chenliang Li, Bin Bi, Qi Qian, Wei Wang, Guohai Xu, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video.
ICML (2023)
Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Alex Lin, Fei Huang
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.
CoRR (2023)
Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang
mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model.
CoRR (2023)
Haiyang Xu, Qinghao Ye, Xuan Wu, Ming Yan, Yuan Miao, Jiabo Ye, Guohai Xu, Anwen Hu, Yaya Shi, Guangwei Xu, Chenliang Li, Qi Qian, Maofei Que, Ji Zhang, Xiao Zeng, Fei Huang
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks.
CoRR (2023)
Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Yuhao Dan, Chenlin Zhao, Guohai Xu, Chenliang Li, Junfeng Tian, Qian Qi, Ji Zhang, Fei Huang
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding.
CoRR (2023)
Qinghao Ye, Haiyang Xu, Guohai Xu, Jiabo Ye, Ming Yan, Yiyang Zhou, Junyang Wang, Anwen Hu, Pengcheng Shi, Yaya Shi, Chenliang Li, Yuanhong Xu, Hehong Chen, Junfeng Tian, Qian Qi, Ji Zhang, Fei Huang
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality.
CoRR (2023)
Haiyang Xu, Qinghao Ye, Ming Yan, Yaya Shi, Jiabo Ye, Yuanhong Xu, Chenliang Li, Bin Bi, Qi Qian, Wei Wang, Guohai Xu, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video.
CoRR (2023)
Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Lin, Fei Huang
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.
EMNLP (Findings) (2023)
Qinghao Ye, Haiyang Xu, Jiabo Ye, Ming Yan, Anwen Hu, Haowei Liu, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration.
CoRR (2023)
Jiabo Ye, Junfeng Tian, Ming Yan, Xiaoshan Yang, Xuwu Wang, Ji Zhang, Liang He, Xin Lin
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding.
CVPR (2022)
Xuwu Wang, Jiabo Ye, Zhixu Li, Junfeng Tian, Yong Jiang, Ming Yan, Ji Zhang, Yanghua Xiao
CAT-MNER: Multimodal Named Entity Recognition with Knowledge-Refined Cross-Modal Attention.
ICME (2022)
Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, He Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, Luo Si
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
EMNLP (2022)
Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, Hehong Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, Luo Si
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
CoRR (2022)
Zijing Yang, Jiabo Ye, Linlin Wang, Xin Lin, Liang He
Inferring substitutable and complementary products with Knowledge-Aware Path Reasoning based on dynamic policy network.
Knowl. Based Syst. 235 (2022)
Xuwu Wang, Junfeng Tian, Min Gui, Zhixu Li, Jiabo Ye, Ming Yan, Yanghua Xiao
PromptMNER: Prompt-Based Entity-Related Visual Clue Extraction and Integration for Multimodal Named Entity Recognition.
DASFAA (3) (2022)
Jiabo Ye, Junfeng Tian, Ming Yan, Xiaoshan Yang, Xuwu Wang, Ji Zhang, Liang He, Xin Lin
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding.
CoRR (2022)
Jiabo Ye, Xin Lin, Liang He, Dingbang Li, Qin Chen
One-Stage Visual Grounding via Semantic-Aware Feature Filter.
ACM Multimedia (2021)
Zijing Yang, Jiabo Ye, Linlin Wang, Xin Lin, Liang He
Inferring Substitutable and Complementary Products with Knowledge-Aware Path Reasoning based on Dynamic Policy Network.
CoRR (2021)