Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
Jeonghoon Kim, Jung Hyun Lee, Sungdong Kim, Joonsuk Park, Kang Min Yoo, Se Jung Kwon, Dongsoo Lee. Published in: CoRR (2023)
Keyphrases
- memory-efficient
- fine-tuning
- language model
- uniform quantization
- language modeling
- n-gram
- probabilistic model
- document retrieval
- language modelling
- viable alternative
- speech recognition
- retrieval model
- information retrieval
- query expansion
- test collection
- fine-tuned
- external memory
- ad hoc information retrieval
- data structure
- document ranking
- pseudo relevance feedback
- context sensitive
- statistical language models
- vector space model
- smoothing methods
- language models for information retrieval
- document length
- statistical language modeling
- spoken term detection
- query specific
- relevance model
- query terms
- relevant documents