Yaohui Cai
Publication Activity (10 Years)
Years Active: 2019-2024
Publications (10 Years): 15
Top Topics
Smoothing Methods
Language Model
Comprehensive Evaluation
Ad Hoc Information Retrieval
Top Venues
CoRR
FPGA
EMC2@NeurIPS
ICML
Publications
Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang
A Comprehensive Evaluation of FPGA-Based Spatial Acceleration of LLMs.
FPGA
(2024)
Dingyi Dai, Yichi Zhang, Jiahao Zhang, Zhanqiu Hu, Yaohui Cai, Qi Sun, Zhiru Zhang
Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs.
CoRR
(2024)
Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Christopher De Sa
QuIP: 2-Bit Quantization of Large Language Models With Guarantees.
CoRR
(2023)
Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Christopher De Sa
QuIP: 2-Bit Quantization of Large Language Models With Guarantees.
NeurIPS
(2023)
Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference.
CoRR
(2023)
Yaohui Cai, Weizhe Hua, Hongzheng Chen, G. Edward Suh, Christopher De Sa, Zhiru Zhang
Structured Pruning is All You Need for Pruning CNNs at Initialization.
CoRR
(2022)
Wuxinlin Cheng, Chenhui Deng, Zhiqiang Zhao, Yaohui Cai, Zhiru Zhang, Zhuo Feng
SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation.
CoRR
(2021)
Qijing Huang, Dequan Wang, Zhen Dong, Yizhao Gao, Yaohui Cai, Tian Li, Bichen Wu, Kurt Keutzer, John Wawrzynek
CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs.
FPGA
(2021)
Wuxinlin Cheng, Chenhui Deng, Zhiqiang Zhao, Yaohui Cai, Zhiru Zhang, Zhuo Feng
SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation.
ICML
(2021)
Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami, Michael W. Mahoney, Kurt Keutzer
ZeroQ: A Novel Zero Shot Quantization Framework.
CVPR
(2020)
Qijing Huang, Dequan Wang, Yizhao Gao, Yaohui Cai, Zhen Dong, Bichen Wu, Kurt Keutzer, John Wawrzynek
Algorithm-hardware Co-design for Deformable Convolution.
CoRR
(2020)
Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami, Michael W. Mahoney, Kurt Keutzer
ZeroQ: A Novel Zero Shot Quantization Framework.
CoRR
(2020)
Zhen Dong, Dequan Wang, Qijing Huang, Yizhao Gao, Yaohui Cai, Bichen Wu, Kurt Keutzer, John Wawrzynek
CoDeNet: Algorithm-hardware Co-design for Deformable Convolution.
CoRR
(2020)
Qijing Huang, Dequan Wang, Yizhao Gao, Yaohui Cai, Zhen Dong, Bichen Wu, Kurt Keutzer, John Wawrzynek
Algorithm-hardware Co-design for Deformable Convolution.
EMC2@NeurIPS
(2019)
Zhen Dong, Zhewei Yao, Yaohui Cai, Daiyaan Arfeen, Amir Gholami, Michael W. Mahoney, Kurt Keutzer
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks.
CoRR
(2019)