Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?
Sonal Sannigrahi, Josef van Genabith, Cristina España-Bonet
Published in: CoRR (2023)
Keyphrases
- vector space
- low dimensional
- euclidean space
- manifold learning
- dimensionality reduction
- distance measure
- document set
- automatic text summarization
- document images
- text documents
- information retrieval
- vector space model
- multilingual information retrieval
- text generation
- automatic summarization
- text corpus
- document level
- binary codes
- document clustering
- web documents
- information retrieval systems
- digital libraries