Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt.
Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava
Published in: CoRR (2023)
Keyphrases
- trade-off
- computational efficiency
- computational complexity
- high accuracy
- bias-variance
- prediction accuracy
- accuracy rate
- bayesian networks
- computational cost
- classification accuracy
- database
- success rate
- social networks
- artificial intelligence
- precision and recall
- data compression
- highly efficient
- inference process
- machine learning