Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems.

Jack FitzGerald, Shankar Ananthakrishnan, Konstantine Arkoudas, Davide Bernardi, Abhishek Bhagia, Claudio Delli Bovi, Jin Cao, Rakesh Chada, Amit Chauhan, Luoxin Chen, Anurag Dwarakanath, Satyam Dwivedi, Turan Gojayev, Karthik Gopalakrishnan, Thomas Gueudre, Dilek Hakkani-Tur, Wael Hamza, Jonathan J. Hüser, Kevin Martin Jose, Haidar Khan, Beiye Liu, Jianhua Lu, Alessandro Manzotti, Pradeep Natarajan, Karolina Owczarzak, Gokmen Oz, Enrico Palumbo, Charith Peris, Chandana Satya Prakash, Stephen Rawls, Andy Rosenbaum, Anjali Shenoy, Saleh Soltan, Mukund Harakere Sridhar, Liz Tan, Fabian Triefenbach, Pan Wei, Haiyang Yu, Shuai Zheng, Gökhan Tür, Prem Natarajan
Published in: CoRR (2022)
Keyphrases
  • natural language understanding
  • computational model
  • machine learning
  • management system
  • parameter values
  • data mining
  • learning process
  • language understanding