Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism.

Published in: Proc. VLDB Endow. (2022)

Keyphrases