Automatic Normalization of Word Variations in Code-Mixed Social Media Text.
Rajat Singh, Nurendra Choudhary, Manish Shrivastava
Published in: CICLing (1) (2018)
Keyphrases
- social media
- keywords
- text corpus
- user comments
- source code
- natural language text
- sentence level
- english words
- text retrieval
- compressed text
- text segments
- related words
- word counts
- text documents
- social networks
- lexical features
- chinese text
- word sense
- multiword
- string matching
- text input
- syntactic categories
- english text
- stop words
- lexical information
- linguistic information
- word frequency
- printed documents
- word level
- information retrieval
- training corpus
- word pairs
- noun phrases
- word sense disambiguation
- syntactic analysis
- semantic information
- co-occurrence
- text mining
- printed text
- sentence similarity
- spoken documents
- natural language processing
- cursive handwriting