Translate and Classify: Improving Sequence Level Classification for English-Hindi Code-Mixed Data.
Devansh Gautam, Kshitij Gupta, Manish Shrivastava
Published in: CALCS@NAACL (2021)
Keyphrases
- mixed data
- automatic classification
- decision trees
- classification accuracy
- machine translation
- automatically classify
- feature extraction
- feature selection
- language identification
- support vector
- natural language
- source code
- text classification
- neural network
- spoken language
- training set
- data sets
- cross language
- data compression
- feature vectors
- pre classified
- mixture of Gaussian distributions