Automatic Normalization of Word Variations in Code-Mixed Social Media Text.
Rajat Singh, Nurendra Choudhary, Manish Shrivastava
Published in: CoRR (2018)
Keyphrases
- social media
- compressed text
- related words
- text corpus
- string matching
- text input
- user comments
- English text
- text retrieval
- word counts
- natural language text
- linguistic information
- word pairs
- Chinese text
- keywords
- sentence level
- unknown words
- information retrieval
- word level
- text mining
- co-occurrence
- stop words
- spoken documents
- English words
- social networks
- handwritten documents
- source code
- semantic information
- noun phrases
- big data
- named entity recognizer
- syntactic information
- n-gram
- sentence similarity
- printed text
- user generated content
- text segments
- lexical features
- document analysis
- lexical information
- training corpus
- syntactic analysis