Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance.

Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, Reut Tsarfaty
Published in: CoRR (2024)
Keyphrases
  • high level
  • probabilistic model
  • information extraction
  • computational model
  • neural network
  • computer vision
  • probability distribution
  • statistical model
  • formal model