High-throughput Generative Inference of Large Language Models with a Single GPU.
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark W. Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
Published in: CoRR (2023)
Keyphrases
- high throughput
- language model
- language modeling
- microarray
- probabilistic model
- n gram
- genome wide
- document retrieval
- biological data
- systems biology
- speech recognition
- retrieval model
- test collection
- query expansion
- generative model
- language modelling
- statistical language models
- information retrieval
- proteomic data
- data acquisition
- mass spectrometry
- smoothing methods
- language models for information retrieval
- protein protein interactions
- bayesian networks
- vector space model
- gene expression
- genomic data
- document ranking
- low cost
- data mining
- real time
- high speed
- pattern recognition
- mass spectrometry data
- language modeling framework