​
Qianchao Zhu
ORCID
Publication Activity (10 Years)
Years Active: 2021-2024
Publications (10 Years): 6
Top Topics
Massively Parallel
Coarse Grained
Matrix Valued
Deep Learning
Top Venues
ICPP
Proc. ACM Program. Lang.
ASPLOS (3)
CoRR
Publications
Qianchao Zhu
FreeStencil: A Fine-Grained Solver Compiler with Graph and Kernel Optimizations on Structured Meshes for Modern GPUs.
ICPP
(2024)
Qianchao Zhu, Jiangfei Duan, Chang Chen, Siran Liu, Xiuhong Li, Guanyu Feng, Xin Lv, Huanqi Cao, Xiao Chuanfu, Xingcheng Zhang, Dahua Lin, Chao Yang
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention.
CoRR
(2024)
Chang Chen, Xiuhong Li, Qianchao Zhu, Jiangfei Duan, Peng Sun, Xingcheng Zhang, Chao Yang
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning.
ASPLOS (3)
(2024)
Huanqi Cao, Shizhi Tang, Qianchao Zhu, Bowen Yu, Wenguang Chen
Mat2Stencil: A Modular Matrix-Based DSL for Explicit and Implicit Matrix-Free PDE Solvers on Structured Grid.
Proc. ACM Program. Lang.
7 (OOPSLA2) (2023)
Lijuan Jiang, Ping Xu, Qianchao Zhu, Xiuhong Li, Shengen Yan, Xingcheng Zhang, Dahua Lin, Wenjing Ma, Zhouyang Li, Jun Liu, Jinming Ma, Minxi Jin, Chao Yang
EasyView: Enabling and Scheduling Tensor Views in Deep Learning Compilers.
ICPP
(2022)
Qianchao Zhu, Hao Luo, Chao Yang, Mingshuo Ding, Wanwang Yin, Xinhui Yuan
Enabling and scaling the HPCG benchmark on the newest generation Sunway supercomputer with 42 million heterogeneous cores.
SC
(2021)