Automatically extracted parallel corpora enriched with highly useful metadata? A Wikipedia case study combining machine learning and social technology.
Ahmad Aghaebrahimian, Andy Stauder, Michael Ustaszewski
Published in: Digit. Scholarsh. Humanit. (2021)
Keyphrases
- automatically extracted
- case study
- parallel corpora
- machine learning
- metadata
- wikipedia articles
- machine translation
- semantic information
- digital libraries
- software development
- labor intensive
- cross language information retrieval
- text classification
- cross language
- natural language processing
- language independent
- cross lingual
- knowledge base
- social media
- document collections
- text mining
- knowledge representation
- user generated content
- feature selection
- information extraction
- social networks
- search engine