Comparison of the scanned pages of the contractual documents.
Elena Andreeva, Vladimir V. Arlazarov, Temudzhin Manzhikov, Oleg Slavin
Published in: ICMV (2017)
Keyphrases
- web documents
- keywords
- page layout
- information retrieval
- website
- textual content
- information retrieval systems
- document collections
- document images
- search engine
- document classification
- document retrieval
- relevant documents
- xml documents
- web pages
- metadata
- web information
- scanned documents
- scanned document images
- content similarity
- web crawler
- focused crawling
- document clustering
- html pages
- structured documents
- scanned images
- historical documents
- vector space model
- www pages