Document normalization revisited.
Abdur Chowdhury, M. Catherine McCabe, David A. Grossman, Ophir Frieder
Published in: SIGIR (2002)
Keyphrases
- retrieval systems
- document images
- document collections
- document retrieval
- text documents
- information retrieval
- keywords
- structured documents
- document classification
- document clustering
- database
- web documents
- information retrieval systems
- textual content
- vector space model
- document ranking
- document structure
- digital documents
- semantic information
- web search
- preprocessing
- digital libraries
- website
- web pages
- databases
- real time