PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference.
Dongjie Yang, Xiaodong Han, Yan Gao, Yao Hu, Shilin Zhang, Hai Zhao
Published in: CoRR (2024)
Keyphrases
- high throughput
- genome wide
- microarray
- systems biology
- biological data
- genomic data
- dna sequencing
- bayesian networks
- low latency
- multiresolution
- data acquisition
- bayesian inference
- protein protein interactions
- mass spectrometry
- machine learning
- living cells
- query processing
- input image
- microarray images
- mass spectrometry data
- real time
- analysis of gene expression