Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge.
Xuan Shen
Peiyan Dong
Lei Lu
Zhenglun Kong
Zhengang Li
Ming Lin
Chao Wu
Yanzhi Wang
Published in:
CoRR (2023)
Keyphrases
bayesian networks
quantization error
edge detection
software development
information processing
probabilistic inference
belief networks
bayesian inference
weighted graph
edge information
inference process
data sets
development process
supply chain management
memory efficient
inference mechanism