Balancing Training Data for Automated Annotation of Keywords: a Case Study.
Gustavo E. A. P. A. Batista, Ana L. C. Bazzan, Maria Carolina Monard
Published in: WOB (2003)
Keyphrases
- training data
- keywords
- training set
- classification accuracy
- supervised learning
- data sets
- learning algorithm
- search engine
- keyword extraction
- case study
- training examples
- test set
- web documents
- training samples
- domain knowledge
- decision trees
- training process
- keyword search
- test bed
- test data
- index terms
- neural network
- labeled data
- semantic information
- visual features
- text documents
- text retrieval
- input data
- information retrieval systems
- small number
- training dataset
- textual information
- training instances
- relational databases
- unseen data
- web pages