Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks.
Barack Wamkaya Wanjawa, Lilian Wanzare, Florence Indede, Owen McOnyango, Edward Ombui, Lawrence Muchemi
Published in: CoRR (2022)
Keyphrases
- natural language processing
- natural language
- computational linguistics
- language processing
- linguistic knowledge
- machine learning
- broad coverage
- feature engineering
- information extraction
- free text
- sentiment analysis
- question answering
- text corpora
- programming language
- text understanding
- linguistic analysis
- language learning
- transfer learning
- named entity recognition
- semantic analysis
- computational biology
- text mining
- test set
- reference resolution
- web pages