WASSUP? LOL : Characterizing Out-of-Vocabulary Words in Twitter.
Suman Kalyan Maity, Anshit E. Chaudhary, Shraman Kumar, Animesh Mukherjee, Chaitanya Sarda, Abhijeet Patil, Akash Mondal
Published in: CSCW Companion (2016)
Keyphrases
- out of vocabulary
- n gram
- language model
- word segmentation
- named entity recognition
- cross language information retrieval
- spoken document retrieval
- broadcast news
- query words
- parallel corpora
- hand crafted
- social media
- named entities
- cross lingual
- term frequency
- query terms
- query translation
- previously unseen
- retrieval model
- information retrieval
- probabilistic model
- information extraction
- text classification
- bag of words
- word recognition
- language independent
- language modeling
- document retrieval
- machine translation