HU-PageScan: a fully convolutional neural network for document page crop.
Ricardo Batista das Neves Junior, Estanislau Lima, Byron L. D. Bezerra, Cleber Zanchettin, Alejandro H. Toselli
Published in: IET Image Process. (2020)
Keyphrases
- convolutional neural network
- face detection
- page layout analysis
- keywords
- website
- www pages
- document type
- html documents
- document images
- information retrieval
- web pages
- web documents
- neural network
- document collections
- web crawler
- machine learning
- document classification
- document retrieval
- semantic information
- document clustering
- structured documents
- information retrieval systems
- document analysis
- retrieval systems
- page layout
- click logs
- wikipedia pages
- link analysis
- link structure
- textual content
- invariant moments
- user queries
- feature vectors