SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models.
Yu Yang
Siddhartha Mishra
Jeffrey N. Chiang
Baharan Mirzasoleiman
Published in:
CoRR (2024)
Keyphrases
fine tuning
language model
probabilistic model
language modeling
training data
speech recognition
document retrieval
statistical language models
knowledge discovery
labeled data
n gram
statistical models
retrieval model
language modelling