DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference.

Connor Holmes, Masahiro Tanaka, Michael Wyatt, Ammar Ahmad Awan, Jeff Rasley, Samyam Rajbhandari, Reza Yazdani Aminabadi, Heyang Qin, Arash Bakhtiari, Lev Kurilenko, Yuxiong He
Published in: CoRR (2024)
Keyphrases