Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens.

Ziqian Zeng, Jiahong Yu, Qianshi Pang, Zihao Wang, Huiping Zhuang, Cen Chen
Published in: CoRR (2024)
Keyphrases
  • language model
  • probabilistic model
  • information retrieval
  • decision trees
  • classification accuracy
  • error rate
  • document retrieval
  • relevance model
  • language modelling
  • clustering algorithm
  • query expansion