Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving.
Yilong Zhao
Chien-Yu Lin
Kan Zhu
Zihao Ye
Lequn Chen
Size Zheng
Luis Ceze
Arvind Krishnamurthy
Tianqi Chen
Baris Kasikci
Published in:
MLSys (2024)
Keyphrases
bit vector
computationally efficient
high quality
computational complexity
computationally expensive
real time
multiscale
digital images
high accuracy
image processing
lightweight
magnetic tape
uniform quantization