Foreign Words and the Automatic Processing of Arabic Social Media Text Written in Roman Script.
Ramy Eskander, Mohamed Al-Badrashiny, Nizar Habash, Owen Rambow
Published in: CodeSwitch@EMNLP (2014)
Keyphrases
- automatic processing
- indian languages
- language identification
- arabic language
- social media
- arabic text
- optical character recognition
- arabic documents
- text recognition
- printed text
- printed documents
- document images
- text documents
- unknown words
- keywords
- cross lingual
- word segmentation
- document level
- chinese text
- character n grams
- english words
- handwritten documents
- social networks
- speaker identification
- word level
- document analysis
- text mining
- user comments
- lexical features
- character recognition
- related words
- text summarization
- noun phrases
- writing style
- word spotting
- word pairs
- syntactic categories
- text classification
- text representation
- text retrieval
- linguistic information
- text corpus
- world knowledge
- n gram
- natural language text
- short text