Double array structures based on byte segmentation for n-gram.
Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe
Published in: Int. J. Comput. Appl. Technol. (2015)
Keyphrases
- n-gram
- word segmentation
- language model
- language modeling
- language independent
- bag of words
- text classification
- variable length
- segmentation algorithm
- image segmentation
- language modelling
- part of speech
- level set
- Viterbi algorithm
- text mining
- digital libraries
- cross lingual
- web pages
- search engine
- information retrieval
- machine learning