COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining.
Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song
Published in: CoRR (2021)
Keyphrases
- language model
- information retrieval
- language modeling
- document retrieval
- document level
- text retrieval
- n gram
- probabilistic model
- speech recognition
- query expansion
- language modelling
- multiword
- hidden Markov models
- retrieval model
- ad hoc information retrieval
- context sensitive
- text documents
- mixture model
- test collection
- document representation
- pseudo relevance feedback
- relevance model
- web documents
- text mining
- statistical language models
- information extraction
- query terms
- vector space model
- textual content
- document ranking
- term dependencies