Chinese WPLC: A Chinese Dataset for Evaluating Pretrained Language Models on Word Prediction Given Long-Range Context.
Huibin Ge, Chenxi Sun, Deyi Xiong, Qun Liu
Published in: EMNLP (1) (2021)
Keyphrases
- long range
- language model
- word segmentation
- n gram
- language modeling
- short range
- probabilistic model
- context sensitive
- translation model
- out of vocabulary
- information retrieval
- statistical language modeling
- prediction accuracy
- document retrieval
- statistical language models
- word error rate
- conditional random fields
- test collection
- retrieval model
- long range interactions
- speech recognition
- query expansion
- language modelling
- text classification
- language models for information retrieval
- query terms
- long range correlations
- context dependent
- text mining
- information extraction
- keywords