Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes.

Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alex Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister
Published in: ACL (Findings) (2023)
Keyphrases
  • language model
  • probabilistic model
  • training data
  • statistical model
  • language modeling
  • search engine
  • generative model
  • test collection
  • context sensitive
  • translation model