Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity.
Katharina Hämmerl, Alina Fastowski, Jindrich Libovický, Alexander Fraser
Published in: CoRR (2023)
Keyphrases
- cross-lingual
- language modeling
- language model
- sentence similarity
- semantic similarity
- cross-lingual information retrieval
- cross-language
- language independent
- n-gram
- translation model
- information retrieval
- probabilistic model
- document retrieval
- retrieval model
- query expansion
- pseudo feedback
- multi-document summarization
- parallel corpus
- test collection
- relevance model
- text classification
- query translation
- word segmentation
- vector space model
- semantic information
- pseudo relevance feedback
- query terms
- parallel corpora
- co-occurrence
- out-of-vocabulary
- text retrieval
- information retrieval systems