Bohan Zhai
Publication Activity (10 Years)
Years Active: 2020-2024
Publications (10 Years): 15
Top Topics
Speech Recognition
False Matches
Language Model
N Gram
Top Venues
CoRR
ACL (Findings)
ICASSP
ECCV (37)
Publications
Xiaotian Han, Yiqi Wang, Bohan Zhai, Quanzeng You, Hongxia Yang
COCO is "ALL" You Need for Visual Instruction Fine-tuning.
CoRR
(2024)
Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding.
CoRR
(2024)
Sheng Shen, Shijia Yang, Tianjun Zhang, Bohan Zhai, Joseph E. Gonzalez, Kurt Keutzer, Trevor Darrell
Multitask Vision-Language Prompt Tuning.
WACV
(2024)
Yiqi Wang, Wentao Chen, Xiaotian Han, Xudong Lin, Haiteng Zhao, Yongfei Liu, Bohan Zhai, Jianbo Yuan, Quanzeng You, Hongxia Yang
Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning.
CoRR
(2024)
Haogeng Liu, Quanzeng You, Yiqi Wang, Xiaotian Han, Bohan Zhai, Yongfei Liu, Wentao Chen, Yiren Jian, Yunzhe Tao, Jianbo Yuan, Ran He, Hongxia Yang
InfiMM: Advancing Multimodal Understanding with an Open-Sourced Visual Language Model.
ACL (Findings)
(2024)
Bohan Zhai, Shijia Yang, Xiangchen Zhao, Chenfeng Xu, Sheng Shen, Dongdi Zhao, Kurt Keutzer, Manling Li, Tan Yan, Xiangjun Fan
HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption.
CoRR
(2023)
Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang
CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models.
CoRR
(2023)
Sehoon Kim, Amir Gholami, Zhewei Yao, Nicholas Lee, Patrick Wang, Aniruddha Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer
Integer-Only Zero-Shot Quantization for Efficient Speech Recognition.
ICASSP
(2022)
Chenfeng Xu, Shijia Yang, Tomer Galanti, Bichen Wu, Xiangyu Yue, Bohan Zhai, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models.
ECCV (37)
(2022)
Sheng Shen, Shijia Yang, Tianjun Zhang, Bohan Zhai, Joseph E. Gonzalez, Kurt Keutzer, Trevor Darrell
Multitask Vision-Language Prompt Tuning.
CoRR
(2022)
Chenfeng Xu, Shijia Yang, Bohan Zhai, Bichen Wu, Xiangyu Yue, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka
Image2Point: 3D Point-Cloud Understanding with Pretrained 2D ConvNets.
CoRR
(2021)
Sehoon Kim, Amir Gholami, Zhewei Yao, Aniruddha Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer
Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition.
CoRR
(2021)
Chenfeng Xu, Bohan Zhai, Bichen Wu, Tian Li, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka
You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module.
IROS
(2021)
Chenfeng Xu, Bohan Zhai, Bichen Wu, Tian Li, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka
You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module.
CoRR
(2021)
Bohan Zhai, Tianren Gao, Flora Xue, Daniel Rothchild, Bichen Wu, Joseph E. Gonzalez, Kurt Keutzer
SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis.
CoRR
(2020)