Integrated Parallel Sentence and Fragment Extraction from Comparable Corpora: A Case Study on Chinese-Japanese Wikipedia.
Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Published in: ACM Trans. Asian Low Resour. Lang. Inf. Process. (2016)
Keyphrases
- bilingual lexicon
- comparable corpora
- text summarization
- semi-automatically
- parallel corpora
- Wikipedia articles
- cross language information retrieval
- information extraction
- machine translation
- sentence level
- word pairs
- natural language
- semantic relations
- news articles
- named entities
- WordNet
- language modeling
- text documents
- natural language processing
- knowledge base
- linguistic resources
- target language
- link structure
- translation model
- text corpora
- source language
- bi-directional
- machine learning
- information retrieval