Distribution of Multi-Words in Chinese and English Documents.
Wen Zhang, Taketoshi Yoshida, Xijin J. Tang
Published in: Int. J. Inf. Technol. Decis. Mak. (2009)
Keyphrases
- chinese text
- unknown words
- person names
- stop words
- arabic documents
- multiword
- word segmentation
- english chinese
- text documents
- keyword extraction
- english words
- parallel corpus
- chinese texts
- keywords
- linguistic analysis
- arabic language
- indian languages
- chinese characters
- parallel corpora
- word spotting
- cross language information retrieval
- chinese language
- chinese english
- source language
- training corpus
- document collections
- machine translation
- word sense
- related words
- topic hierarchy
- word level
- monolingual retrieval
- word pairs
- chinese word segmentation
- text corpora
- document analysis
- document representation
- cross language
- chinese web
- text summarization
- bilingual dictionaries
- part of speech
- text retrieval
- cross lingual
- natural language
- text mining
- n gram
- document retrieval
- word sense disambiguation
- english text
- information retrieval
- language identification
- document clustering
- query translation
- text classification
- dependency parser
- semantic roles
- character recognition
- pos tagging
- tf idf
- language learning
- relevant documents
- named entities
- text categorization