Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training.
Zexuan Zhong, Mengzhou Xia, Danqi Chen, Mike Lewis
Published in: CoRR (2024)
Keyphrases
- language model
- autoregressive
- mixture model
- language modeling
- probabilistic model
- n-gram
- document retrieval
- non-stationary
- random fields
- speech recognition
- information retrieval
- query expansion
- test collection
- Gaussian Markov random field
- ad hoc information retrieval
- context sensitive
- retrieval model
- expert finding
- smoothing methods
- translation model
- SAR images
- relevance model
- conditional random fields
- machine learning
- multiscale
- word clouds