AMBERT: A Pre-trained Language Model with Multi-Grained Tokenization
Xinsong Zhang, Pengshuai Li, Hang Li. Published in: ACL/IJCNLP (Findings), 2021
Keyphrases
- language model
- pre-trained
- n-gram
- language modeling
- document retrieval
- information retrieval
- speech recognition
- probabilistic model
- retrieval model
- training data
- query expansion
- training examples
- test collection
- ad hoc information retrieval
- mixture model
- context-sensitive
- pseudo relevance feedback
- relevance model
- neural network
- named entities
- query terms
- hidden Markov models
- translation model
- smoothing methods
- feature space
- Dirichlet prior
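The "multi-grained tokenization" in the title refers to representing the same input at two granularities: fine-grained tokens (e.g., words or characters) and coarse-grained tokens (e.g., phrases or n-grams). The snippet below is a minimal illustrative sketch of that idea only, not the paper's tokenizer: the phrase lexicon, the greedy longest-match strategy, and all names in the code are assumptions made for the example.

```python
# Illustrative sketch of multi-grained tokenization: the same sentence is
# tokenized at a fine-grained level (single words) and at a coarse-grained
# level (multi-word phrases matched greedily against a lexicon).
# The lexicon and the greedy longest-match rule are assumptions for this
# example, not AMBERT's actual tokenization pipeline.

# Hypothetical phrase lexicon; in practice this would be mined from a corpus.
PHRASE_LEXICON = {
    ("new", "york"),
    ("new", "york", "city"),
    ("machine", "learning"),
    ("language", "model"),
}
MAX_PHRASE_LEN = max(len(p) for p in PHRASE_LEXICON)


def fine_grained_tokenize(text: str) -> list[str]:
    """Fine-grained tokens: whitespace-split words (a stand-in for
    word- or character-level tokenization)."""
    return text.lower().split()


def coarse_grained_tokenize(text: str) -> list[str]:
    """Coarse-grained tokens: greedy longest match against the phrase
    lexicon; words covered by no phrase fall back to single tokens."""
    words = fine_grained_tokenize(text)
    tokens, i = [], 0
    while i < len(words):
        match = None
        # Try the longest possible phrase first, then shorter ones.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 1, -1):
            if tuple(words[i:i + n]) in PHRASE_LEXICON:
                match = words[i:i + n]
                break
        if match:
            tokens.append("_".join(match))
            i += len(match)
        else:
            tokens.append(words[i])
            i += 1
    return tokens


if __name__ == "__main__":
    sentence = "A language model trained on New York City news"
    print(fine_grained_tokenize(sentence))
    # ['a', 'language', 'model', 'trained', 'on', 'new', 'york', 'city', 'news']
    print(coarse_grained_tokenize(sentence))
    # ['a', 'language_model', 'trained', 'on', 'new_york_city', 'news']
```

In a multi-grained setup, both token sequences would be fed to the model side by side, letting it attend to word-level detail and phrase-level units at the same time.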