Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens.
Ziqian Zeng
Jiahong Yu
Qianshi Pang
Zihao Wang
Huiping Zhuang
Cen Chen
Published in:
CoRR (2024)
Keyphrases
language model
probabilistic model
information retrieval
decision trees
classification accuracy
error rate
document retrieval
relevance model
language modelling
clustering algorithm
query expansion