Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism.
Xupeng Miao
Yujie Wang
Youhe Jiang
Chunan Shi
Xiaonan Nie
Hailin Zhang
Bin Cui
Published in:
CoRR (2022)
Keyphrases
parallel processing
parallel architectures
computational power
online learning
test set
learning algorithm
artificial intelligence
database systems
supervised learning
general purpose
lightweight
fault diagnosis
semi automatic
cost effective
computationally expensive
shared memory