Multistage Collaborative Knowledge Distillation from Large Language Models.
Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md. Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum
Published in: CoRR (2023)
Keyphrases
- multistage
- language model
- language modeling
- n-gram
- production system
- probabilistic model
- single stage
- dynamic programming
- statistical language models
- speech recognition
- lot sizing
- stochastic programming
- context sensitive
- document retrieval
- vector space model
- retrieval model
- language modelling
- information retrieval
- query expansion
- test collection
- query terms
- optimal policy
- learning algorithm
- language models for information retrieval
- error rate
- web search
- document ranking
- knowledge discovery
- search engine
- smoothing methods
- data mining