Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation.
Minglun Han, Feilong Chen, Jing Shi, Shuang Xu, Bo Xu
Published in: INTERSPEECH (2023)
Keyphrases
- language model
- knowledge transfer
- pre-trained
- speech recognition
- language modeling
- n-gram
- knowledge sharing
- probabilistic model
- document retrieval
- information retrieval
- transfer learning
- test collection
- query expansion
- retrieval model
- training data
- training examples
- query terms
- data sets
- cross-lingual
- neural network
- natural language processing
- text categorization
- labeled data
- decision trees