An impact of linguistic features on automated classification of OCR texts.
Gudila Paul Moshi, Lazaro S. P. Busagala, Wataru Ohyama, Tetsushi Wakabayashi, Fumitaka Kimura
Published in: Document Analysis Systems (2010)
Keyphrases
- context sensitive
- linguistic features
- automated classification
- linguistic information
- text classification
- named entities
- structural features
- semantic features
- linguistic knowledge
- sentence level
- feature set
- part of speech
- named entity recognition
- news stories
- natural language processing
- information retrieval
- keywords
- image retrieval
- statistical model
- semantic information
- text mining