MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers.
Mohammadmahdi Nouriborji, Omid Rohanian, Samaneh Kouchaki, David A. Clifton
Published in: EACL (2023)
Keyphrases
- computational model
- formal model
- probabilistic model
- similarity measure
- conceptual model
- high level
- expert systems
- experimental data
- mathematical model
- probability distribution
- linear model
- neural network
- parameter values
- closed form
- theoretical framework
- prior knowledge
- evolutionary algorithm
- data structure
- decision trees
- information systems
- learning algorithm
- information retrieval