Intelligent Content Based Title and Author Name Extraction from Formatted Documents.
Eric G. Berkowitz, Mohamed Reda Elkhadiri, Tim Sahouri, Michel Abraham
Published in: MAICS (2004)
Keyphrases
- keywords
- xml documents
- image retrieval
- document collections
- relevant documents
- document classification
- information retrieval
- authorship attribution
- document retrieval
- text documents
- multimedia
- information retrieval systems
- retrieval systems
- relevance feedback
- vector space
- document analysis
- web documents
- plagiarism detection
- intelligent systems
- vector space model
- database
- extensible markup language
- legal documents
- digital documents
- document clustering
- metadata
- structured data
- document representation
- document set
- semi structured data
- search engine
- ranked list
- electronic documents
- intelligent search
- user queries
- text queries