Sheng Li
Publication Activity (10 Years)
Years Active: 2007-2023
Publications (10 Years): 23
Top Topics
Arrival Times
IBM SP
Disk Accesses
Machine Learning Systems
Top Venues
CoRR
ISCA
IEEE Micro
ACM Trans. Comput. Syst.
Publications
Jordan Dotzel, Gang Wu, Andrew Li, Muhammad Umar, Yun Ni, Mohamed S. Abdelfattah, Zhiru Zhang, Liqun Cheng, Martin G. Dixon, Norman P. Jouppi, Quoc V. Le, Sheng Li
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search.
CoRR
(2023)
Hong Liu, Ryohei Urata, Kevin Yasumura, Xiang Zhou, Roy Bannon, Jill Berger, Pedram Dashti, Norm Jouppi, Cedric F. Lam, Sheng Li, Erji Mao, Daniel Nelson, George Papen, Muhammad Mukarram Bin Tariq, Amin Vahdat
Lightwave Fabrics: At-Scale Optical Circuit Switching for Datacenter and Machine Learning Systems.
SIGCOMM
(2023)
Norman P. Jouppi, George Kurian, Sheng Li, Peter C. Ma, Rahul Nagarajan, Lifeng Nai, Nishant Patil, Suvinay Subramanian, Andy Swing, Brian Towles, Cliff Young, Xiang Zhou, Zongwei Zhou, David A. Patterson
TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings.
CoRR
(2023)
Yongqi Huang, Peng Ye, Xiaoshui Huang, Sheng Li, Tao Chen, Tong He, Wanli Ouyang
Experts Weights Averaging: A New General Training Scheme for Vision Transformers.
CoRR
(2023)
Norman P. Jouppi, George Kurian, Sheng Li, Peter C. Ma, Rahul Nagarajan, Lifeng Nai, Nishant Patil, Suvinay Subramanian, Andy Swing, Brian Towles, Cliff Young, Xiang Zhou, Zongwei Zhou, David A. Patterson
TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings.
ISCA
(2023)
Cheng Fu, Hanxian Huang, Zixuan Jiang, Yun Ni, Lifeng Nai, Gang Wu, Liqun Cheng, Yanqi Zhou, Sheng Li, Andrew Li, Jishen Zhao
TripLe: Revisiting Pretrained Model Reuse and Progressive Learning for Efficient Vision Transformer Scaling and Searching.
ICCV
(2023)
Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc V. Le, Norman P. Jouppi
Searching for Fast Model Families on Datacenter Accelerators.
CVPR
(2021)
Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc V. Le, Norman P. Jouppi
Searching for Fast Model Families on Datacenter Accelerators.
CoRR
(2021)
Thomas Norrie, Nishant Patil, Doe Hyun Yoon, George Kurian, Sheng Li, James Laudon, Cliff Young, Norman P. Jouppi, David A. Patterson
The Design Process for Google's Training Chips: TPUv2 and TPUv3.
IEEE Micro
41 (2) (2021)
Norman P. Jouppi, Doe Hyun Yoon, Matthew Ashcraft, Mark Gottscho, Thomas B. Jablin, George Kurian, James Laudon, Sheng Li, Peter C. Ma, Xiaoyu Ma, Thomas Norrie, Nishant Patil, Sushma Prasad, Cliff Young, Zongwei Zhou, David A. Patterson
Ten Lessons From Three Generations Shaped Google's TPUv4i: Industrial Product.
ISCA
(2021)
Tianqi Tang, Sheng Li, Lifeng Nai, Norman P. Jouppi, Yuan Xie
NeuroMeter: An Integrated Power, Area, and Timing Modeling Framework for Machine Learning Accelerators Industry Track Paper.
HPCA
(2021)
Thomas Norrie, Nishant Patil, Doe Hyun Yoon, George Kurian, Sheng Li, James Laudon, Cliff Young, Norman P. Jouppi, David A. Patterson
Google's Training Chips Revealed: TPUv2 and TPUv3.
Hot Chips Symposium
(2020)
Norman P. Jouppi, Doe Hyun Yoon, George Kurian, Sheng Li, Nishant Patil, James Laudon, Cliff Young, David A. Patterson
A domain-specific supercomputer for training deep neural networks.
Commun. ACM
63 (7) (2020)
Shihao Ji, Nadathur Satish, Sheng Li, Pradeep Dubey
Parallelizing Word2Vec in Shared and Distributed Memory.
IEEE Trans. Parallel Distributed Syst.
30 (9) (2019)
Eojin Lee, Jongwook Chung, Daejin Jung, Sukhan Lee, Sheng Li, Jung Ho Ahn
Work as a team or individual: Characterizing the system-level impacts of main memory partitioning.
IISWC
(2017)
Sheng Li, Hyeontaek Lim, Victor W. Lee, Jung Ho Ahn, Anuj Kalia, Michael Kaminsky, David G. Andersen, Seongil O, Sukhan Lee, Pradeep Dubey
Full-Stack Architecting to Achieve a Billion-Requests-Per-Second Throughput on a Single Key-Value Store Server Platform.
ACM Trans. Comput. Syst.
34 (2) (2016)
Shihao Ji, Nadathur Satish, Sheng Li, Pradeep Dubey
Parallelizing Word2Vec in Shared and Distributed Memory.
CoRR
(2016)
Sheng Li, Hyeontaek Lim, Victor W. Lee, Jung Ho Ahn, Anuj Kalia, Michael Kaminsky, David G. Andersen, Seongil O, Sukhan Lee, Pradeep Dubey
Achieving One Billion Key-Value Requests per Second on a Single Server.
IEEE Micro
36 (3) (2016)
Daejin Jung, Sheng Li, Jung Ho Ahn
Large Pages on Steroids: Small Ideas to Accelerate Big Memory Applications.
IEEE Comput. Archit. Lett.
15 (2) (2016)
Shihao Ji, Nadathur Satish, Sheng Li, Pradeep Dubey
Parallelizing Word2Vec in Multi-Core and Many-Core Architectures.
CoRR
(2016)
Jishen Zhao, Sheng Li, Jichuan Chang, John L. Byrne, Laura L. Ramirez, Kevin T. Lim, Yuan Xie, Paolo Faraboschi
Buri: Scaling Big-Memory Computing with Hardware-Based Memory Expansion.
ACM Trans. Archit. Code Optim.
12 (3) (2015)
Sheng Li, Hyeontaek Lim, Victor W. Lee, Jung Ho Ahn, Anuj Kalia, Michael Kaminsky, David G. Andersen, Seongil O, Sukhan Lee, Pradeep Dubey
Architecting to achieve a billion requests per second throughput on a single key-value store server platform.
ISCA
(2015)
Ke Chen, Sheng Li, Jung Ho Ahn, Naveen Muralimanohar, Jishen Zhao, Cong Xu, Seongil O, Yuan Xie, Jay B. Brockman, Norman P. Jouppi
History-Assisted Adaptive-Granularity Caches (HAAG$) for High Performance 3D DRAM Architectures.
ICS
(2015)
Jung Ho Ahn, Sheng Li, Seongil O, Norman P. Jouppi
McSimA+: A manycore simulator with application-level+ simulation and detailed microarchitecture modeling.
ISPASS
(2013)
Jishen Zhao, Sheng Li, Doe Hyun Yoon, Yuan Xie, Norman P. Jouppi
Kiln: closing the performance gap between systems with and without persistence support.
MICRO
(2013)
Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, Norman P. Jouppi
The McPAT Framework for Multicore and Manycore Architectures: Simultaneously Modeling Power, Area, and Timing.
ACM Trans. Archit. Code Optim.
10 (1) (2013)
Sheng Li, Doe Hyun Yoon, Ke Chen, Jishen Zhao, Jung Ho Ahn, Jay B. Brockman, Yuan Xie, Norman P. Jouppi
MAGE: adaptive granularity and ECC for resilient and power efficient memory systems.
SC
(2012)
Ke Chen, Sheng Li, Naveen Muralimanohar, Jung Ho Ahn, Jay B. Brockman, Norman P. Jouppi
CACTI-3DD: Architecture-level modeling for 3D die-stacked DRAM main memory.
DATE
(2012)
Sheng Li, Kevin T. Lim, Paolo Faraboschi, Jichuan Chang, Parthasarathy Ranganathan, Norman P. Jouppi
System-level integrated server architectures for scale-out datacenters.
MICRO
(2011)
Sheng Li, Ke Chen, Jung Ho Ahn, Jay B. Brockman, Norman P. Jouppi
CACTI-P: Architecture-level modeling for SRAM-based structures with advanced leakage reduction techniques.
ICCAD
(2011)
Sheng Li, Shannon K. Kuntz, Jay B. Brockman, Peter M. Kogge
Lightweight Chip Multi-Threading (LCMT): Maximizing Fine-Grained Parallelism On-Chip.
IEEE Trans. Parallel Distributed Syst.
22 (7) (2011)
Sheng Li, Ke Chen, Ming-yu Hsieh, Naveen Muralimanohar, Chad D. Kersey, Jay B. Brockman, Arun F. Rodrigues, Norman P. Jouppi
System implications of memory reliability in exascale computing.
SC
(2011)
Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, Norman P. Jouppi
McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures.
MICRO
(2009)
Sheng Li, Shannon K. Kuntz, Peter M. Kogge, Jay B. Brockman
Memory model effects on application performance for a lightweight multithreaded architecture.
IPDPS
(2008)
Jay B. Brockman, Sheng Li, Peter M. Kogge, Amit Kashyap, Mohammad M. Mojarradi
Design of a mask-programmable memory/multiplier array using G4-FET technology.
DAC
(2008)
Sheng Li, Amit Kashyap, Shannon K. Kuntz, Jay B. Brockman, Peter M. Kogge, Paul L. Springer, Gary Block
A Heterogeneous Lightweight Multithreaded Architecture.
IPDPS
(2007)