DUET: Detection Utilizing Enhancement for Text in Scanned or Captured Documents.
Eun-Soo Jung, HyeongGwan Son, Kyusam Oh, Yongkeun Yun, Soonhwan Kwon, Min Soo Kim
Published in: CoRR (2021)
Keyphrases
- text documents
- document images
- scanned documents
- information retrieval
- free text
- web documents
- text lines
- digital documents
- textual content
- keywords
- document processing
- document analysis
- text analysis
- page layout
- textual data
- scanned images
- text information
- scanned document images
- plagiarism detection
- textual documents
- document categorization
- text data
- text collections
- printed documents
- text clustering
- information retrieval systems
- document content
- text retrieval
- image enhancement
- document collections
- semantic information
- document set
- complex background
- text classifiers
- vector space model
- line extraction
- topic segmentation
- natural language text
- linguistic analysis
- latent semantic analysis
- retrieval engine
- electronic documents
- text detection
- newspaper articles
- automatic categorization
- metadata
- text corpus
- text mining
- relevant documents
- document clustering
- textual information
- key concepts
- scientific documents
- document retrieval
- document representation
- document structure
- news stories
- structured documents
- semantic content
- multimedia documents
- text corpora
- handwritten documents
- text content
- scientific literature
- journal articles
- wordnet
- natural language processing
- relevance feedback