Identifying Historical Period and Ethnic Origin of Documents Using Stylistic Feature Sets.
Yaakov HaCohen-Kerner, Hananya Beck, Elchai Yehudai, Dror Mughaz
Published in: Discovery Science (2006)
Keyphrases
- feature set
- feature selection
- random forest
- feature vectors
- document collections
- feature extraction
- historical documents
- feature space
- authorship attribution
- classification accuracy
- feature types
- information retrieval
- information retrieval systems
- extracted features
- keywords
- xml documents
- metadata
- structural features
- class labels
- data sets
- linguistic features
- document clustering
- text documents
- principal components
- feature subset
- audio features
- machine learning
- retrieval systems
- classification models
- closely related
- decision trees
- multiple features
- web pages
- learning algorithm
- data mining
- database
- page layout analysis