Improving Version-Aware Word Documents.
Alexandre Azevedo Filho, Ethan V. Munson, Cheng Thao
Published in: DocEng (2017)
Keyphrases
- word spotting
- word frequencies
- information retrieval
- text corpus
- keywords
- natural language text
- sentence similarity
- document collections
- web documents
- document classification
- text documents
- relevant documents
- document clustering
- word frequency
- index terms
- multiword
- latent topics
- printed documents
- n-gram
- stop words
- word pairs
- page layout
- spoken documents
- linguistic information
- information retrieval systems
- document retrieval
- sentence level
- word co-occurrence
- co-occurrence
- handwritten documents
- related words
- term weighting
- training corpus
- metadata
- XML documents
- information extraction
- text mining
- multi-document summarization
- word similarity
- document space
- related documents
- term frequency
- vector space model
- document images
- word recognition
- parallel corpus
- noun phrases
- character n-grams
- historical documents
- concept space
- word sense disambiguation
- document representation