Modeling OOV Words With Letter N-Grams in Statistical Taggers: Preliminary Work in Biomedical Entity Recognition.
Teemu Ruokolainen, Miikka Silfverberg
Published in: NODALIDA (2013)
Keyphrases
- n gram
- out of vocabulary
- statistical language modeling
- part of speech
- language model
- word segmentation
- bag of words
- language independent
- text classification
- language modeling
- pos tagging
- variable length
- information extraction
- language specific
- cross lingual
- data analysis
- neural network
- character n grams
- inside outside algorithm
- text mining
- test collection
- web documents
- knowledge discovery
- machine learning