WASSUP? LOL : Characterizing Out-of-Vocabulary Words in Twitter.
Suman Kalyan Maity, Chaitanya Sarda, Anshit E. Chaudhary, Abhijeet Patil, Shraman Kumar, Akash Mondal, Animesh Mukherjee
Published in: CoRR (2016)
Keyphrases
- probabilistic model
- out of vocabulary
- language model
- n gram
- language modeling
- query words
- spoken document retrieval
- social media
- word segmentation
- speech recognition
- named entity recognition
- test collection
- conditional random fields
- broadcast news
- cross language information retrieval
- hand crafted
- query terms
- cross lingual
- parallel corpora
- user generated content
- word recognition
- previously unseen
- named entities