Charagram: Embedding Words and Sentences via Character n-grams.
John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu
Published in: CoRR (2016)
Keyphrases
- character n-grams
- n-gram
- variable length
- language model
- cross language
- cross language information retrieval
- natural language
- language specific
- arabic documents
- language independent
- bag of words
- multiword
- text classification
- optical character recognition
- document level
- vector space
- sentence level
- language modeling
- semantic roles
- out of vocabulary
- document representation
- linguistic features
- feature selection
- text retrieval
- document collections
- co-occurrence
- probabilistic model
- web pages