Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
Jeonghoon Kim, Jung Hyun Lee, Sungdong Kim, Joonsuk Park, Kang Min Yoo, Se Jung Kwon, Dongsoo Lee
Published in: NeurIPS (2023)
Keyphrases
- memory-efficient
- fine-tuning
- language model
- uniform quantization
- language modeling
- information retrieval
- n-gram
- viable alternative
- query expansion
- document retrieval
- speech recognition
- retrieval model
- external memory
- probabilistic model
- test collection
- smoothing methods
- statistical language models
- context sensitive
- data structure
- language model for information retrieval
- translation model
- ad hoc information retrieval
- pseudo relevance feedback
- fine-tuned
- document length
- relevance model
- Okapi BM25
- query terms
- retrieval effectiveness
- query processing
- search engine
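The title's sub-4-bit integer quantization can be illustrated with a minimal sketch of symmetric uniform quantization. This is a generic illustration, not the paper's actual method; the bit-width, rounding scheme, and per-tensor scale here are assumptions made for the example:

```python
def quantize_uniform(weights, bits=3):
    # Symmetric uniform quantization to signed `bits`-bit integers
    # (illustrative only; not the algorithm from the paper).
    qmax = 2 ** (bits - 1) - 1  # e.g. 3 for signed 3-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate floating-point weights from integers.
    return [x * scale for x in q]

w = [0.12, -0.5, 0.33, -0.07]
q, s = quantize_uniform(w, bits=3)
w_hat = dequantize(q, s)
```

With a shared scale per tensor, each weight is stored as a small integer plus one floating-point scale, which is what makes sub-4-bit storage memory-efficient; the reconstruction error per weight is bounded by half the scale.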