FOCA: A System for Classification, Digitalization and Information Retrieval of Trial Balance Documents.
Gokce Aydugan Baydar, Seçil Arslan
Published in: DATA (2019)
Keyphrases
- information retrieval
- document collections
- information retrieval systems
- document classification
- automatic categorization
- relevant documents
- document categorization
- document retrieval
- automatic classification
- vector space model
- classification accuracy
- information extraction
- text classification
- text clustering
- text collections
- query expansion
- latent semantic indexing
- retrieval systems
- document clustering
- machine learning
- image classification
- feature extraction
- decision trees
- query terms
- text retrieval
- latent semantic analysis
- structured documents
- class labels
- language model
- support vector machine
- support vector
- feature selection
- pre-classified