Xuanlei Zhao
Publication Activity (10 Years)
Years Active: 2024-2024
Publications (10 Years): 7
Top Topics
Bayesian Inference
Resource Constraints
Memory Efficient
Language Modelling
Top Venues
CoRR
ICLR
PPoPP
MLSys
Publications
Ziming Liu, Shaoyu Wang, Shenggan Cheng, Zhongkai Zhao, Xuanlei Zhao, James Demmel, Yang You
WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem.
CoRR (2024)
Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Jiarui Fang, Tian Zheng, Ruidong Wu, Xiwen Zhang, Jian Peng, Yang You
FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters.
PPoPP (2024)
Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Haotian Zhou, Bin Jia, Yang You
AutoChunk: Automated Activation Chunk for Memory-Efficient Deep Learning Inference.
ICLR (2024)
Xuanlei Zhao, Shenggan Cheng, Zangwei Zheng, Zheming Yang, Ziming Liu, Yang You
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers.
CoRR (2024)
Xuanlei Zhao, Bin Jia, Haotian Zhou, Ziming Liu, Shenggan Cheng, Yang You
HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices.
MLSys (2024)
Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You
AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference.
CoRR (2024)
Xuanlei Zhao, Bin Jia, Haotian Zhou, Ziming Liu, Shenggan Cheng, Yang You
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices.
CoRR (2024)