Efficient out-of-vocabulary term detection by n-gram array indices with distance from a syllable lattice.
Keisuke Iwami, Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa
Published in: ICASSP (2011)
Keyphrases
- n gram
- out of vocabulary
- language model
- text classification
- bag of words
- language independent
- language specific
- word segmentation
- language modeling
- query terms
- part of speech
- web documents
- Viterbi algorithm
- word level
- machine learning
- test collection
- speech recognition
- probabilistic model
- term frequency
- natural language