TextBenDS: a generic Textual data Benchmark for Distributed Systems.
Ciprian-Octavian Truica, Elena Apostol, Jérôme Darmont, Ira Assent
Published in: CoRR (2021)
Keyphrases
- distributed systems
- textual data
- information extraction
- structured data
- text mining
- fault tolerant
- natural language processing
- distributed environment
- textual information
- raw data
- text documents
- text data
- text categorization
- high level
- generative model
- formal concept analysis
- web documents
- data sets
- nearest neighbor
- low level
- active learning
- natural language
- machine learning