PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference.
Dongjie Yang, Xiaodong Han, Yan Gao, Yao Hu, Shilin Zhang, Hai Zhao
Published in: ACL (Findings) (2024)
Keyphrases
- high throughput
- microarray
- genome-wide
- biological data
- systems biology
- multiresolution
- data acquisition
- mass spectrometry data
- genomic data
- protein-protein interactions
- DNA sequencing
- low latency
- flow cytometry
- proteomic data
- Bayesian inference
- gene expression
- query processing
- dynamic Bayesian networks
- monitoring system
- mass spectrometry
- data model
- living cells
- Bayesian networks
- feature extraction
- data sets