Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes.
Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alex Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister. Published in: ACL (Findings), 2023.