A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale.

Hao-Jun Michael Shi, Tsung-Hsien Lee, Shintaro Iwasaki, Jose Gallego-Posada, Zhijing Li, Kaushik Rangadurai, Dheevatsa Mudigere, Michael Rabbat
Published in: CoRR (2023)
Keyphrases