Thorough Characterization and Analysis of Large Transformer Model Training At-Scale.
Scott Cheng, Jun-Liang Lin, Murali Emani, Siddhisanket Raskar, Sam Foreman, Zhen Xie, Venkatram Vishwanath, Mahmut Taylan Kandemir
Published in: Proc. ACM Meas. Anal. Comput. Syst. (2024)
Keyphrases
- probabilistic model
- parameter estimation
- statistical analysis
- data mining
- mathematical model
- training examples
- cost function
- image analysis
- high level
- theoretical analysis
- knowledge base
- machine learning
- computational model
- theoretical framework
- data sets
- conceptual model
- neural network model
- formal model
- training algorithm
- empirical data