Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models.

Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni
Published in: CoRR (2023)
Keyphrases