Zhiquan Lai
Publication Activity (10 Years)
Years Active: 2012-2023
Publications (10 Years): 36
Top Topics: Deep Learning, Training Set, Neural Network, Graph Representation
Top Venues: CoRR, CLUSTER, IEEE Trans. Computers, ICPP
Publications
Zhiquan Lai, Shengwei Li, Xudong Tang, Keshi Ge, Weijie Liu, Yabo Duan, Linbo Qiao, Dongsheng Li:
Merak: An Efficient Distributed DNN Training Framework With Automated 3D Parallelism for Giant Foundation Models.
IEEE Trans. Parallel Distributed Syst. 34 (5) (2023)

Wei Wang, Zhiquan Lai, Shengwei Li, Weijie Liu, Keshi Ge, Yujie Liu, Ao Shen, Dongsheng Li:
Prophet: Fine-grained Load Balancing for Parallel Training of Large-scale MoE Models.
CLUSTER (2023)

Keshi Ge, Kai Lu, Yongquan Fu, Xiaoge Deng, Zhiquan Lai, Dongsheng Li:
Compressed Collective Sparse-Sketch for Distributed Data-Parallel Training of Deep Learning Models.
IEEE J. Sel. Areas Commun. 41 (4) (2023)

Lizhi Zhang, Kai Lu, Zhiquan Lai, Yongquan Fu, Yu Tang, Dongsheng Li:
Accelerating GNN Training by Adapting Large Graphs to Distributed Heterogeneous Architectures.
IEEE Trans. Computers 72 (12) (2023)

Hongyu Chen, Zhejiang Ran, Keshi Ge, Zhiquan Lai, Jingfei Jiang, Dongsheng Li:
Auto-Divide GNN: Accelerating GNN Training with Subgraph Division.
Euro-Par (2023)

Peng Liang, Yu Tang, Xiaoda Zhang, Youhui Bai, Teng Su, Zhiquan Lai, Linbo Qiao, Dongsheng Li:
A Survey on Auto-Parallelism of Large-Scale Deep Learning Training.
IEEE Trans. Parallel Distributed Syst. 34 (8) (2023)
Shengwei Li, Zhiquan Lai, Yanqi Hao, Weijie Liu, Keshi Ge, Xiaoge Deng, Dongsheng Li, Kai Lu:
Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training.
CoRR (2023)

Ning Liu, Songlei Jian, Dongsheng Li, Yiming Zhang, Zhiquan Lai, Hongzuo Xu:
Hierarchical Adaptive Pooling by Capturing High-Order Dependency for Graph Representation Learning.
IEEE Trans. Knowl. Data Eng. 35 (4) (2023)

Yuanyuan Xiao, Zhiquan Lai, Dongsheng Li:
CD-Sched: An Automated Scheduling Framework for Accelerating Neural Network Training on Shared Memory CPU-DSP Platforms.
PCCNT (2023)

Shengwei Li, Zhiquan Lai, Dongsheng Li, Yiming Zhang, Xiangyu Ye, Yabo Duan:
EmbRace: Accelerating Sparse Communication for Distributed Training of Deep Neural Networks.
ICPP (2022)

Keshi Ge, Zhejiang Ran, Zhiquan Lai, Lizhi Zhang, Dongsheng Li:
BRGraph: An efficient graph neural network training system by reusing batch data on GPU.
Concurr. Comput. Pract. Exp. 34 (15) (2022)

Zhiquan Lai, Shengwei Li, Xudong Tang, Keshi Ge, Weijie Liu, Yabo Duan, Linbo Qiao, Dongsheng Li:
Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models.
CoRR (2022)
Weijie Liu, Zhiquan Lai, Shengwei Li, Yabo Duan, Keshi Ge, Dongsheng Li:
AutoPipe: A Fast Pipeline Parallelism Approach with Balanced Partitioning and Micro-batch Slicing.
CLUSTER (2022)

Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li:
DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation.
CoRR (2022)

Yuqi He, Zhiquan Lai, Zhejiang Ran, Lizhi Zhang, Dongsheng Li:
Accelerating Sample-based GNN Training by Feature Caching on GPUs.
SmartCloud (2022)

Yabo Duan, Zhiquan Lai, Shengwei Li, Weijie Liu, Keshi Ge, Peng Liang, Dongsheng Li:
HPH: Hybrid Parallelism on Heterogeneous Clusters for Accelerating Large-scale DNNs Training.
CLUSTER (2022)

Yuqi He, Zhiquan Lai, Zhejiang Ran, Lizhi Zhang, Dongsheng Li:
SCGraph: Accelerating Sample-based GNN Training by Staged Caching of Features on GPUs.
ISPA/BDCloud/SocialCom/SustainCom (2022)

Keshi Ge, Yongquan Fu, Yiming Zhang, Zhiquan Lai, Xiaoge Deng, Dongsheng Li:
S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning.
ICASSP (2022)
Lizhi Zhang, Zhiquan Lai, Shengwei Li, Yu Tang, Feng Liu, Dongsheng Li:
2PGraph: Accelerating GNN Training over Large Graphs on GPU Clusters.
CLUSTER (2021)

Shengwei Li, Zhiquan Lai, Dongsheng Li, Xiangyu Ye, Yabo Duan:
EmbRace: Accelerating Sparse Communication for Distributed Training of NLP Neural Networks.
CoRR (2021)

Zhejiang Ran, Zhiquan Lai, Lizhi Zhang, Dongsheng Li:
Accelerate Graph Neural Network Training by Reusing Batch Data on GPUs.
IPCCC (2021)

Keshi Ge, Yongquan Fu, Zhiquan Lai, Xiaoge Deng, Dongsheng Li:
S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning.
CoRR (2021)

Lizhi Zhang, Zhiquan Lai, Yu Tang, Dongsheng Li, Feng Liu, Xiaochun Luo:
PCGraph: Accelerating GNN Inference on Large Graphs via Partition Caching.
ISPA/BDCloud/SocialCom/SustainCom (2021)

Xiangyu Ye, Zhiquan Lai, Shengwei Li, Lei Cai, Ding Sun, Linbo Qiao, Dongsheng Li:
Hippie: A Data-Paralleled Pipeline Approach to Improve Memory-Efficiency and Scalability for Large DNN Training.
ICPP (2021)

Dongsheng Li, Zhiyao Hu, Zhiquan Lai, Yiming Zhang, Kai Lu:
Coordinative Scheduling of Computation and Communication in Data-Parallel Systems.
IEEE Trans. Computers 70 (12) (2021)

Keshi Ge, Yiming Zhang, Yongquan Fu, Zhiquan Lai, Xiaoge Deng, Dongsheng Li:
CASQ: Accelerate Distributed Deep Learning with Sketch-Based Gradient Quantization.
CLUSTER (2021)
Xiangyu Ye, Zhiquan Lai, Dongsheng Li:
Prediction of the Cyanobacteria Coverage in Time-series Images based on Convolutional Neural Network.
ICCCV (2021)

Yuetong Yang, Zhiquan Lai, Lei Cai, Dongsheng Li:
HMA: An Efficient Training Method for NLP Models.
ICIAI (2021)

Ning Liu, Songlei Jian, Dongsheng Li, Yiming Zhang, Zhiquan Lai, Hongzuo Xu:
Hierarchical Adaptive Pooling by Capturing High-order Dependency for Graph Representation Learning.
CoRR (2021)

Yuetong Yang, Zhiquan Lai, Lei Cai, Dongsheng Li:
Poster Abstract: Model Average-based Distributed Training for Sparse Deep Neural Networks.
INFOCOM Workshops (2020)

Yu Tang, Zhigang Kan, Dequan Sun, Linbo Qiao, Jingjing Xiao, Zhiquan Lai, Dongsheng Li:
ADMMiRNN: Training RNN with Stable Convergence via An Efficient ADMM Approach.
CoRR (2020)

Yu Tang, Zhigang Kan, Dequan Sun, Linbo Qiao, Jingjing Xiao, Zhiquan Lai, Dongsheng Li:
ADMMiRNN: Training RNN with Stable Convergence via an Efficient ADMM Approach.
ECML/PKDD (2) (2020)
Dongsheng Li, Zhiquan Lai, Keshi Ge, Yiming Zhang, Zhaoning Zhang, Qinglin Wang, Huaimin Wang:
HPDL: Towards a General Framework for High-performance Distributed Deep Learning.
ICDCS (2019)

Zhiquan Lai, King Tin Lam, Cho-Li Wang, Jinshu Su:
PoweRock: Power Modeling and Flexible Dynamic Power Management for Many-Core Architectures.
IEEE Syst. J. 11 (2) (2017)

Yan Zhu, Guidong Zhang, Zhiquan Lai, Boya Niu, Yongjun Shen:
A Two-Tiered Defence of Techniques to Prevent SQL Injection Attacks.
IMIS (2017)

Zhiquan Lai, King Tin Lam, Cho-Li Wang, Jinshu Su:
Latency-aware DVFS for efficient power state transitions on many-core architectures.
J. Supercomput. 71 (7) (2015)

King Tin Lam, Jinghao Shi, Dominic Hung, Cho-Li Wang, Zhiquan Lai, Wangbin Zhu, Youliang Yan:
Rhymes: A shared virtual memory system for non-coherent tiled many-core architectures.
ICPADS (2014)

Zhiquan Lai, Baokang Zhao, Jinshu Su:
Efficient DVFS to Prevent Hard Faults for Many-Core Architectures.
ICT-EurAsia (2014)

Zhiquan Lai, King Tin Lam, Cho-Li Wang, Jinshu Su:
A Power Modelling Approach for Many-Core Architectures.
SKG (2014)

Lin-Bo Qiao, Bo-Feng Zhang, Zhiquan Lai, Jinshu Su:
Mining of Attack Models in IDS Alerts from Network Backbone by a Two-stage Clustering Method.
IPDPS Workshops (2012)