Word-wise Script Identification from Bilingual Documents Based on Morphological Reconstruction.
B. V. Dhandra, Mallikarjun Hangarge, Ravindra S. Hegadi, V. S. Malemath
Published in: ICDIM (2006)
Keyphrases
- multiword
- parallel corpus
- word pairs
- parallel corpora
- bilingual lexicon
- word frequencies
- word spotting
- source language
- word alignment
- bilingual dictionaries
- indian languages
- document collections
- cross language information retrieval
- keywords
- cross lingual
- machine translation
- cross language
- sentence level
- document retrieval
- latent topics
- text corpus
- term frequency
- information retrieval
- document classification
- openings and closings
- natural language text
- word frequency
- english chinese
- printed documents
- related documents
- spoken documents
- statistical machine translation
- query translation
- document clustering
- query terms
- web documents
- language independent
- character n grams
- comparable corpora
- word recognition
- related words
- machine translation system
- text documents
- user queries
- target language
- text classification
- language model
- out of vocabulary
- semantic similarity
- co occurrence
- sentence pairs
- text corpora
- translation model
- semantic relations
- information extraction
- word sense disambiguation
- information retrieval systems
- document representation
- relevant documents