QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving.
Yujun Lin
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
Published in:
CoRR (2024)
Keyphrases
cost effective
computer vision
neural network
social networks
image segmentation
three dimensional
multiscale
motion estimation
computationally expensive