DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus.
Martin Brümmer, Milan Dojchinovski, Sebastian Hellmann
Published in: LREC (2016)
Keyphrases
- training corpus
- part of speech
- grammar induction
- natural language processing
- information extraction
- text classification
- training data
- text mining
- statistical machine translation
- question answering
- natural language
- word sense disambiguation
- POS tagging
- machine translation
- semantic relations
- machine learning
- cross lingual
- cross language information retrieval
- document retrieval
- language independent
- cross language
- named entities
- bag of words
- knowledge base
- learning algorithm
- information retrieval