Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?
Sonal Sannigrahi, Josef van Genabith, Cristina España-Bonet
Published in: EACL (Findings) (2023)
Keyphrases
- vector space
- information retrieval
- low dimensional
- distance measure
- automatic summarization
- dimensionality reduction
- information retrieval systems
- manifold learning
- natural language
- digital libraries
- euclidean space
- document images
- word frequency
- text generation
- document level
- document retrieval
- text documents
- multilingual information retrieval
- single document summarization
- noun phrases
- text summarization
- retrieval systems
- test collection
- co-occurrence
- keywords