"Alignment is All You Need": Analyzing Cross-Lingual Document Similarity for Domain-Specific Applications.
Sourav Dutta
Published in: CLEOPATRA@WWW (2021)
Keyphrases
- cross lingual
- document similarity
- document clustering
- domain specific
- document representation
- language modeling
- machine translation
- relevance model
- text classification
- news articles
- clustering algorithm
- text mining
- semantic similarity
- document collections
- language model
- information retrieval
- vector space model
- clustering method
- latent dirichlet allocation
- cluster analysis
- web documents
- tf idf