Improving document clustering using automated machine translation.
Xiang Wang, Buyue Qian, Ian Davidson
Published in: CIKM (2012)
Keyphrases
- machine translation
- document clustering
- cross lingual
- text mining
- information extraction
- natural language processing
- document collections
- language independent
- text clustering
- clustering method
- clustering algorithm
- document representation
- parallel corpus
- text documents
- natural language
- cross language information retrieval
- statistical machine translation
- target language
- vector space model
- machine translation system
- k-means
- tf-idf
- word alignment
- data mining
- WordNet
- pairwise constraints
- machine learning
- query translation
- information retrieval
- search engine
- feature selection