Deeper Delta Across Genres and Languages: Do We Really Need the Most Frequent Words?
Jan Rybicki, Maciej Eder
Published in: DH (2010)
Keyphrases
- arabic language
- expressive power
- n-gram
- semistructured documents
- language specific
- language independent
- frequency counts
- word order
- word forms
- keywords
- arabic documents
- multilingual documents
- indian languages
- word segmentation
- text summarization
- databases
- mining frequent
- context free
- language identification
- genre classification
- multiword
- word recognition
- word sense disambiguation
- text documents
- compound words
- frequent patterns