From Bilingual Dictionaries to Interlingual Document Representations.
Jagadeesh Jagarlamudi, Hal Daumé III, Raghavendra Udupa
Published in: ACL (2) (2011)
Keyphrases
- document representation
- bilingual dictionaries
- machine translation
- comparable corpora
- multiword
- text documents
- cross lingual
- parallel corpora
- cross language information retrieval
- language model
- bag of words
- language modeling
- document collections
- vector space model
- data fusion
- document clustering
- information extraction
- vector space
- web documents
- translation model
- natural language processing
- query translation
- text mining
- semantic information
- news articles
- natural language
- text data
- machine translation system
- wordnet
- cross language
- information retrieval
- keywords
- text categorization
- image classification
- background knowledge
- text classification
- link structure
- anchor text
- word pairs
- statistical machine translation
- data mining
- word sense disambiguation
- bayesian networks
- query processing
- information retrieval systems
- test collection
- named entities
- document retrieval