NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents.
Tamara Czinczoll, Christoph Hönes, Maximilian Schall, Gerard de Melo
Published in: ACL (1) (2024)
Keyphrases
- language modeling
- higher level
- information retrieval
- language model
- expert finding
- language modeling approaches
- retrieval model
- improvements in retrieval effectiveness
- document retrieval
- document length
- TREC collections
- query expansion
- vector space model
- term weighting
- relevance model
- low level
- information retrieval systems
- cross-lingual
- pseudo feedback
- n-gram
- language modeling framework
- ad hoc information retrieval
- relevant documents
- query terms
- document collections
- term dependencies
- probabilistic model
- statistical language models
- document representation
- term weighting schemes
- document ranking
- retrieval systems
- comparable corpora
- web documents
- high level
- test collection
- retrieval effectiveness
- retrieved documents
- term frequency
- co-occurrence
- relevance feedback
- metadata
- word segmentation
- expert search
- smoothing methods
- document clustering
- text documents
- text classification
- information extraction
- statistical language modeling
- search engine