Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling.
Peijie Jiang, Dingkun Long, Yanzhao Zhang, Pengjun Xie, Meishan Zhang, Min Zhang
Published in: EMNLP (2022)
Keyphrases
- language model
- sequence labeling
- conditional random fields
- language modeling
- probabilistic model
- dependency parsing
- n-gram
- structured prediction
- named entity recognition
- word segmentation
- document retrieval
- query expansion
- retrieval model
- semi-supervised
- test collection
- mixture model
- unsupervised learning
- graphical models
- information retrieval
- generative model
- maximum entropy
- information extraction
- CRF model
- context sensitive
- supervised learning
- hidden Markov models
- prior knowledge
- translation model
- machine learning
- natural language processing
- active learning
- pairwise
- multiword
- information retrieval systems