The impact of imbalanced training data on machine learning for author name disambiguation.
Jinseok Kim, Jenna Kim
Published in: CoRR (2018)
Keyphrases
- machine learning
- training data
- learning algorithm
- decision trees
- supervised learning
- machine learning algorithms
- explanation based learning
- learning tasks
- machine learning methods
- semi supervised learning
- knowledge acquisition
- data sets
- support vector machine
- natural language processing
- test set
- information extraction
- natural language
- classification accuracy
- model selection
- training instances
- active learning
- knowledge representation
- inductive learning
- class distribution
- feature selection
- binary classification problems
- computer vision
- machine learning approaches
- class imbalance
- training process
- semi supervised
- test data
- data analysis
- domain knowledge
- text mining