A 28nm 276.55TFLOPS/W Sparse Deep-Neural-Network Training Processor with Implicit Redundancy Speculation and Batch Normalization Reformulation.
Yang Wang, Yubin Qin, Dazheng Deng, Jingchuan Wei, Tianbao Chen, Xinhan Lin, Leibo Liu, Shaojun Wei, Shouyi Yin
Published in: VLSI Circuits (2021)