1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed.
Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He
Published in: CoRR (2021)
Keyphrases
- convergence speed
- differential evolution
- convergence rate
- firefly algorithm
- global search
- particle swarm optimization
- learning rate
- particle swarm optimization algorithm
- population diversity
- training speed
- pso algorithm
- step size
- search capabilities
- global convergence
- training process
- ant colony optimization algorithm
- faster convergence
- steady state error
- neural network
- training algorithm
- evolutionary algorithm
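The record above contains only the title and keyphrases, not a description of the method. As a rough, hedged illustration of the general idea the title alludes to (error-compensated 1-bit compression of an optimizer update to cut communication cost), the sketch below shows sign-based compression with an error-feedback buffer. All names are hypothetical and the code is not taken from the paper or its implementation.

```python
import numpy as np

def one_bit_compress(update, error_buffer):
    """Illustrative sketch (not the paper's code): compress a tensor to
    its sign, scaled by its mean magnitude, and carry the compression
    error forward so it is re-injected at the next step (error feedback)."""
    corrected = update + error_buffer           # add residual from the previous step
    scale = np.abs(corrected).mean()            # one scalar per tensor
    compressed = scale * np.sign(corrected)     # 1-bit payload plus one float
    error_buffer = corrected - compressed       # residual to reuse next step
    return compressed, error_buffer

# Toy usage: compress a momentum-like tensor over a few iterations.
rng = np.random.default_rng(0)
err = np.zeros(8)
for step in range(3):
    momentum = rng.normal(size=8)
    sent, err = one_bit_compress(momentum, err)
    print(step, sent)
```

The error buffer is what keeps repeated 1-bit rounding from biasing training: whatever is lost at one step is added back before the next compression.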