Searching Four-Millenia-Old Digitized Documents: A Text Retrieval System for Egyptologists.
Estíbaliz Iglesias-Franjo, Jesús Vilares
Published in: LaTeCH@ACL (2016)
Keyphrases
- information retrieval
- text documents
- retrieval engine
- information retrieval systems
- historical manuscripts
- retrieval systems
- document repositories
- digital documents
- web documents
- free text
- text collections
- text retrieval
- plagiarism detection
- document categorization
- document content
- newspaper articles
- text information
- document analysis
- keywords
- text analysis
- textual content
- textual data
- text content
- document collections
- automatic categorization
- latent semantic analysis
- document processing
- textual information
- multimedia documents
- textual documents
- relevant documents
- text mining
- semantic information
- historical documents
- information extraction
- document set
- text clustering
- handwritten text
- string matching
- text data
- printed documents
- linguistic analysis
- electronic documents
- effective retrieval
- document representation
- text corpus
- document structure
- structured documents
- digital libraries
- language model
- word spotting
- topic segmentation
- key concepts
- natural language text
- document clustering
- retrieval strategies
- retrieval method
- handwriting recognition
- relevance feedback
- search engine
- page layout
- text corpora
- text lines
- journal articles
- scanned documents
- document level
- document retrieval
- handwritten documents
- text categorization
- retrieval model
- test collection