1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed.
Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He
Published in: ICML (2021)
Keyphrases
- convergence speed
- convergence rate
- particle swarm optimization algorithm
- particle swarm optimization
- step size
- differential evolution
- global search
- pso algorithm
- global convergence
- learning rate
- firefly algorithm
- real time
- strong robustness
- training speed
- ant colony optimization algorithm
- population diversity
- faster convergence
- genetic programming