Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs.
Jiangsu Du
Jiazhi Jiang
Jiang Zheng
Hongbin Zhang
Dan Huang
Yutong Lu
Published in:
ACM Trans. Archit. Code Optim. (2023)
Keyphrases
real world
memory usage
wide range
fuzzy logic
memory requirements
general purpose
computational power
bayesian inference
synthetic data
main memory
inference process
bayesian networks
data mining
neural network
case study
memory space
data sets