Ancient-Modern Chinese Translation with a New Large Training Dataset.
Dayiheng Liu, Kexin Yang, Qian Qu, Jiancheng Lv
Published in: ACM Trans. Asian Low Resour. Lang. Inf. Process. (2020)
Keyphrases
- training dataset
- training data
- Chinese English
- English Chinese
- training set
- machine translation
- cross language information retrieval
- training samples
- web corpora
- data samples
- Chinese text
- support vectors
- word segmentation
- imbalanced datasets
- class labels
- query translation
- database
- decision trees
- English words
- feature extraction
- parallel corpora
- text summarization
- learning environment
- information extraction
- cultural heritage
- language model