GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model.
Yingying Gao, Shilei Zhang, Chao Deng, Junlan Feng
Published in: CoRR (2024)
Keyphrases
- autoregressive
- generative model
- language model
- pre-trained
- probabilistic model
- random fields
- language modeling
- appearance variations
- non-stationary
- training data
- training examples
- n-gram
- conditional random fields
- speech recognition
- graphical models
- information retrieval
- EM algorithm
- retrieval model
- expectation-maximization
- topic models
- latent Dirichlet allocation
- Bayesian networks
- SAR images
- supervised learning
- prior knowledge
- parameter estimation
- active appearance models