Tokenizer Choice For LLM Training: Negligible or Crucial?

Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr
Published in: CoRR (2023)
Keyphrases
  • training phase
  • wide range
  • training set
  • supervised learning
  • training samples
  • test set
  • real time
  • machine learning
  • computer vision
  • web services
  • optimal solution
  • training examples
  • training process
  • training sessions