Text Classification In The Wild: A Large-Scale Long-Tailed Name Normalization Dataset.
Jiexing Qi, Shuhao Li, Zhixin Guo, Yusheng Huang, Chenghu Zhou, Weinan Zhang, Xinbing Wang, Zhouhan Lin
Published in: ICASSP (2023)
Keyphrases
- text classification
- text classification tasks
- small scale
- text documents
- labeled data
- benchmark datasets
- million images
- text data
- bag of words
- naive bayes
- text categorization
- feature selection
- real life
- text mining
- n gram
- text classifiers
- knn
- document classification
- machine learning
- semantic features
- database
- multi label
- normalization method
- data sets
- real world
- gaussian distribution
- sentiment analysis
- data sources
- knowledge discovery