HumSet: Dataset of Multilingual Information Extraction and Classification for Humanitarian Crises Response.
Selim Fekih, Nicolò Tamagnone, Benjamin Minixhofer, Ranjan Shrestha, Ximena Contla, Ewan Oglethorpe, Navid Rekabsaz
Published in: EMNLP (Findings) (2022)
Keyphrases
- information extraction
- classification accuracy
- machine learning
- automatic classification
- benchmark datasets
- pattern recognition
- uci datasets
- feature set
- feature selection
- classification scheme
- information retrieval
- classification systems
- feature extraction
- document classification
- text mining
- decision trees
- image classification
- question answering
- feature vectors
- classification models
- classification method
- classification algorithm
- class labels
- text categorization
- unsupervised learning
- text classification
- web documents
- support vector machine svm
- text documents
- machine learning methods
- classification rules
- pattern classification
- natural language processing
- support vector machine
- natural language
- training set
- fold cross validation
- data sets