A novel segmentation technique for splitting a typed Persian text to sub-words.
Mahnaz Shafii, Maher A. Sid-Ahmed, Majid Ahmadi
Published in: ISCCSP (2012)
Keyphrases
- word segmentation
- text retrieval
- text documents
- cursive script
- keywords
- topic segmentation
- text segmentation
- Chinese text
- text recognition
- segmentation algorithm
- text lines
- syntactic categories
- word sense disambiguation
- English words
- proper nouns
- level set
- word pairs
- text databases
- image segmentation
- world knowledge
- related words
- short text
- linguistic information
- text representation
- numeral strings
- word level
- text corpus
- text corpora
- text mining
- Arabic language
- noun phrases
- segmentation method
- keyword extraction
- lexical chains
- information retrieval
- word recognition
- higher order
- lexical features
- textual features
- text classification
- feature selection
- n-gram
- newspaper articles
- document analysis
- text detection
- text categorization
- word co-occurrence
- handwritten documents
- co-occurrence
- multiscale
- training corpus