Labeling the Languages of Words in Mixed-Language Documents using Weakly Supervised Methods.
Ben King, Steven P. Abney
Published in: HLT-NAACL (2013)
Keyphrases
- multilingual documents
- Indian languages
- supervised methods
- source language
- target language
- semantic segmentation
- machine translation
- word sense disambiguation
- bilingual dictionaries
- cross-lingual
- weakly supervised
- text documents
- active learning
- labeled data
- semi-supervised
- object class
- keywords
- superpixels
- information retrieval
- document clustering
- conditional random fields
- labeled training data
- unsupervised methods
- natural language
- document classification
- methods require
- document collections
- unlabeled data
- WordNet
- image segmentation