Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models.
Dongwon Jo, Taesu Kim, Yulhwa Kim, Jae-Joon Kim
Published in: CoRR (2024)
Keyphrases
- language model
- memory efficient
- mixture model
- language modeling
- n-gram
- document retrieval
- probabilistic model
- speech recognition
- information retrieval
- language modelling
- retrieval model
- query expansion
- statistical language models
- smoothing methods
- query terms
- spoken term detection
- vector space model
- context sensitive
- test collection
- document ranking
- document length
- relevance model
- translation model
- query specific
- retrieval effectiveness
- document images
- ad hoc information retrieval
- expectation maximization