Measuring Sample Importance in Data Pruning for Training LLMs from a Data Compression Perspective.

Minsang Kim, Seungjun Baek
Published in: CoRR (2024)
Keyphrases
  • data compression
  • data reduction
  • data sets
  • data analysis
  • image data
  • raw data
  • high quality
  • data sources
  • similarity measure
  • association rules
  • data quality
  • compression algorithm
  • mixed data