A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling.
Sae Kyu Lee, Ankur Agrawal, Joel Silberman, Matthew M. Ziegler, Mingu Kang, Swagath Venkataramani, Nianzheng Cao, Bruce M. Fleischer, Michael Guillorn, Matthew Cohen, Silvia M. Mueller, Jinwook Oh, Martin Lutz, Jinwook Jung, Siyu Koswatta, Ching Zhou, Vidhi Zalani, Monodeep Kar, James Bonanno, Robert Casatuta, Chia-Yu Chen, Jungwook Choi, Howard Haynie, Alyssa Herbert, Radhika Jain, Kyu-Hyoun Kim, Yulong Li, Zhibin Ren, Scot Rider, Marcel Schaal, Kerstin Schelm, Michael Scheuermann, Xiao Sun, Hung Tran, Naigang Wang, Wei Wang, Xin Zhang, Vinay Shah, Brian W. Curran, Vijayalakshmi Srinivasan, Pong-Fei Lu, Sunil Shukla, Kailash Gopalakrishnan, Leland Chang
Published in: IEEE J. Solid State Circuits (2022)
Keyphrases
- structured prediction
- physical design
- expert systems
- artificial intelligence
- response time
- supervised learning
- low cost
- intelligent systems
- cmos technology
- training set
- knowledge representation
- training process
- high speed
- training samples
- high precision
- machine learning
- training examples
- case based reasoning
- bayesian networks
- belief networks
- bayesian inference
- computational intelligence
- training phase
- neural network
- programmable logic