Data extraction as text categorization: an experiment with the MUC-3 corpus.
David D. Lewis
Published in: MUC (1991)
Keyphrases
- text categorization
- data extraction
- text collections
- semi-structured
- training documents
- feature selection
- knn
- text classification
- data integration
- k-nearest neighbor
- text documents
- semi-supervised learning
- text classifiers
- tf-idf
- neural network
- web pages
- supervised learning
- web databases
- databases
- automatic text categorization
- unlabeled data
- artificial intelligence
- machine learning
- feature selections