A Realistic Dataset for Performance Evaluation of Document Layout Analysis.
Apostolos Antonacopoulos, David Bridson, Christos Papadopoulos, Stefan Pletschacher
Published in: ICDAR (2009)
Keyphrases
- real life
- benchmark datasets
- document images
- database
- retrieval systems
- document collections
- real world
- information retrieval systems
- web documents
- information retrieval
- document analysis
- document clustering
- text documents
- data sets
- document retrieval
- feature set
- keywords
- user queries
- text classification
- relevant documents
- text mining
- document classification
- structured documents
- website
- document structure
- document processing