Improving the Compression Performance of Turkish Text with PoS Tags.
Ebru Celikel, Bekir Taner Dinçer
Published in: IKE (2004)
Keyphrases
- part of speech
- syntactic information
- training corpus
- linguistic information
- text documents
- text compression
- syntactic categories
- keywords
- text retrieval
- semantic information
- multi lingual
- pos tagging
- compression ratio
- image compression
- natural language processing
- semantic tags
- information retrieval
- text mining
- textual content
- tf idf
- textual features
- free text
- word sense disambiguation
- data compression
- n gram
- natural language text
- scanned documents
- compression algorithm
- metadata
- linguistic features
- web resources
- compressed text
- blog entries
- text classification