Deeper Delta across genres and languages: do we really need the most frequent words?
Jan Rybicki, Maciej Eder
Published in: Lit. Linguistic Comput. (2011)
Keyphrases
- Arabic language
- language specific
- n-gram
- language independent
- frequency counts
- expressive power
- multilingual documents
- word forms
- Arabic documents
- word order
- language identification
- cross lingual
- genre classification
- mining frequent
- databases
- European languages
- POS taggers
- grammatical inference
- syntactic categories
- compound words
- semistructured documents
- word pairs
- word recognition
- word segmentation
- target language
- word sense disambiguation
- semantic web
- natural language processing
- probabilistic model
- natural language
- keywords
- information retrieval