ZeRO: Memory Optimization Towards Training A Trillion Parameter Models.
Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
Published in: CoRR (2019)
Keyphrases
- experimental data
- complex systems
- memory usage
- statistical models
- optimization algorithm
- constrained optimization
- optimization method
- training samples
- probabilistic model
- online learning
- real time
- optimization problems
- support vector machine
- information retrieval
- parameter space
- training process
- machine learning
- multi layer perceptron
- neural network