H3T: Efficient Integration of Memory Optimization and Parallelism for Large-scale Transformer Training.
Yuzhong Wang, Xu Han, Weilin Zhao, Guoyang Zeng, Zhiyuan Liu, Maosong Sun
Published in: NeurIPS (2023)
Keyphrases
- limited memory
- real world
- online learning
- neural network
- fuzzy logic
- data integration
- training algorithm
- global optimization
- parallel execution
- data sets
- computational power
- parallel processing
- combinatorial optimization
- optimization method
- power system
- particle swarm optimization
- supervised learning
- support vector machine