A Factory of Comparable Corpora from Wikipedia.
Alberto Barrón-Cedeño, Cristina España-Bonet, Josu Boldoba, Lluís Màrquez
Published in: BUCC@ACL/IJCNLP (2015)
Keyphrases
- comparable corpora
- wikipedia articles
- parallel corpora
- cross language information retrieval
- bilingual lexicon
- news articles
- language modeling
- word pairs
- machine translation
- text corpora
- semantic relations
- wordnet
- text documents
- semi automatically
- knowledge base
- document collections
- named entities
- link structure
- bilingual dictionaries
- cross language
- semantic features
- language independent
- cross lingual
- co occurrence
- information extraction