Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization.
Jungi Lee, Wonbeom Lee, Jaewoong Sim
Published in: CoRR (2024)
Keyphrases
- language model
- tensor decomposition
- auxiliary information
- data representation
- language modeling
- high order
- n gram
- information retrieval
- document retrieval
- test collection
- probabilistic model
- query expansion
- low rank
- tensor factorization
- retrieval model
- vector space model
- visual data
- translation model
- missing data
- knn
- active learning
- user feedback
- prior knowledge
- multimedia
- search engine