Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor.
Guido Sautter, Klemens Böhm, Frank Padberg, Walter F. Tichy
Published in: ECDL (2007)
Keyphrases
- empirical evaluation
- semi automated
- text documents
- metadata
- text mining
- fully automated
- text categorization
- xml documents
- information extraction
- text classification
- xml data
- topic models
- wordnet
- text data
- document clustering
- bag of words
- keywords
- textual information
- active learning
- xml schema
- document representation
- databases
- text collections
- structured data
- data sets
- k nearest neighbor
- training data
- fully automatic
- prior knowledge
- semi supervised learning
- question answering
- feature extraction
- feature selection
- neural network
- information retrieval systems
- knowledge discovery
- nearest neighbor