A novel unsupervised corpus-based stemming technique using lexicon and corpus statistics.
Jasmeet Singh, Vishal Gupta
Published in: Knowl. Based Syst. (2019)
Keyphrases
- semantic lexicon
- unsupervised learning
- weakly supervised
- semi-supervised
- n-gram
- lexical semantics
- POS tagging
- open domain
- natural language
- supervised learning
- test set
- text corpus
- information retrieval
- unsupervised manner
- descriptive statistics
- stop words
- sentence level
- lexical units
- statistical machine translation
- multiword
- manually annotated
- natural language text
- data driven
- clustering algorithm