A 40nm 4.81TFLOPS/W 8b Floating-Point Training Processor for Non-Sparse Neural Networks Using Shared Exponent Bias and 24-Way Fused Multiply-Add Tree.
Jeongwoo Park, Sunwoo Lee, Dongsuk Jeon. Published in: ISSCC (2021)
Keyphrases
- floating point
- instruction set
- neural network
- training process
- floating point arithmetic
- sparse matrices
- training algorithm
- square root
- fixed point
- floating point unit
- multi layer perceptron
- artificial neural networks
- training set
- back propagation
- interval arithmetic
- parallel processing
- graphics processing units
- index structure
- low cost
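The title's "shared exponent bias" refers to low-precision floating-point formats in which one exponent bias is chosen per group of values rather than fixed globally, extending the dynamic range that a few exponent bits can cover. The sketch below illustrates the general idea with a toy 8-bit format (1 sign bit, 4 exponent bits, 3 mantissa bits); the format parameters, function name, and bias-selection rule are illustrative assumptions, not details taken from the paper.

```python
import math

def quantize_fp8_group(values, exp_bits=4, man_bits=3):
    """Quantize a group of floats to a toy 8-bit FP format whose
    exponent bias is shared across the whole group.

    Illustrative sketch only: the 1-4-3 bit split and the rule of
    anchoring the bias to the group's maximum exponent are assumptions.
    Returns (quantized values, shared bias)."""
    if all(v == 0.0 for v in values):
        return [0.0] * len(values), 0
    # Pick the shared bias so the largest magnitude in the group maps
    # to the top of the representable exponent range.
    max_exp = max(math.frexp(abs(v))[1] for v in values if v != 0.0)
    bias = max_exp - (2 ** exp_bits - 1)
    out = []
    for v in values:
        if v == 0.0:
            out.append(0.0)
            continue
        m, e = math.frexp(abs(v))          # v = +/- m * 2**e, m in [0.5, 1)
        rel = e - bias                     # exponent field relative to bias
        if rel < 0:                        # below representable range:
            out.append(0.0)                # flush to zero (underflow)
            continue
        rel = min(rel, 2 ** exp_bits - 1)  # clamp exponent field
        # Round the mantissa to man_bits fractional bits.
        scale = 2 ** (man_bits + 1)
        m_q = round(m * scale) / scale
        out.append(math.copysign(m_q * 2.0 ** (rel + bias), v))
    return out, bias
```

Values that lie within the group's dynamic range round-trip closely, while values far below the group maximum underflow to zero, which is the usual trade-off of sharing one exponent bias across a tensor or tile.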