Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs.
Jiangsu Du
Jiazhi Jiang
Jiang Zheng
Hongbin Zhang
Dan Huang
Yutong Lu
Published in:
ACM Trans. Archit. Code Optim. (2023)
Keyphrases
real world
memory usage
wide range
fuzzy logic
memory requirements
general purpose
computational power
bayesian inference
synthetic data
main memory
inference process
bayesian networks
data mining
neural network
case study
memory space
data sets