Language identification based on n-gram frequency ranking.
Ricardo de Córdoba, Luis Fernando D'Haro, Fernando Fernández Martínez, Javier Macías Guarasa, Javier Ferreiros
Published in: INTERSPEECH (2007)
Keyphrases
- n-gram
- language identification
- language model
- language modeling
- text classification
- language independent
- speaker identification
- document images
- web search
- variable length
- speech recognition
- expert finding
- document ranking
- probabilistic model
- web information retrieval
- hidden markov models
- word level
- character n-grams
- statistical language modeling