Haogeng Liu
Publication Activity (10 Years)
Years Active: 2023-2024
Publications (10 Years): 7
Top Topics
Language Model
Visual Data
Multi Modal
Speech Synthesis
Top Venues
CoRR
ACL (Findings)
Publications
Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding.
CoRR (2024)
Haogeng Liu, Quanzeng You, Yiqi Wang, Xiaotian Han, Bohan Zhai, Yongfei Liu, Wentao Chen, Yiren Jian, Yunzhe Tao, Jianbo Yuan, Ran He, Hongxia Yang
InfiMM: Advancing Multimodal Understanding with an Open-Sourced Visual Language Model.
ACL (Findings) (2024)
Haogeng Liu, Quanzeng You, Xiaotian Han, Yongfei Liu, Huaibo Huang, Ran He, Hongxia Yang
Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model.
CoRR (2024)
Tingkai Liu, Yunzhe Tao, Haogeng Liu, Qihang Fan, Ding Zhou, Huaibo Huang, Ran He, Hongxia Yang
Video-CSR: Complex Video Digest Creation for Visual-Language Models.
CoRR (2023)
Haogeng Liu, Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Jianhua Tao
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion.
CoRR (2023)
Haogeng Liu, Tao Wang, Jie Cao, Ran He, Jianhua Tao
Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion.
CoRR (2023)
Haogeng Liu, Qihang Fan, Tingkai Liu, Linjie Yang, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang
Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling.
CoRR (2023)