Helmholtz Principle on word embeddings for automatic document segmentation.
Dominik Krzeminski, Helen Balinsky, Alexander Balinsky
Published in: DocEng (2018)
Keyphrases
- helmholtz principle
- fully automatic
- word segmentation
- perceptual grouping
- segmentation algorithm
- image segmentation
- medical images
- level set
- multiscale
- keywords
- numeral strings
- segmentation method
- information retrieval
- document images
- latent topics
- information retrieval systems
- text lines
- term frequency
- document analysis
- printed documents
- compound words
- n-gram
- tf-idf
- multiple objects
- energy function
- word recognition
- graph cuts
- co-occurrence
- computer vision