Datasets: A Community Library for Natural Language Processing.
Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario Sasko, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugger, Clement Delangue, Théo Matussière, Lysandre Debut, Stas Bekman, Pierric Cistac, Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M. Rush, Thomas Wolf
Published in: CoRR (2021)
Keyphrases
- natural language processing
- machine learning
- information extraction
- natural language
- database
- knowledge representation
- benchmark datasets
- wordnet
- knowledge resources
- text mining
- case study
- online communities
- semantic analysis
- computational biology
- computational linguistics
- synthetic and real datasets
- social networks
- machine translation
- free text
- training dataset
- standard machine learning algorithms