Replicated Documents for the World-Wide Web.
Henning Pagnia, Oliver E. Theel, Jürgen Iwik
Published in: EUROMEDIA (1998)
Keyphrases
- information retrieval systems
- information retrieval
- document clustering
- document retrieval
- document collections
- xml documents
- web documents
- document classification
- web information
- text documents
- legal documents
- fault tolerant
- fault tolerance
- document representation
- text analysis
- logical structure
- free text
- digital libraries
- keywords
- machine learning
- html pages
- digital documents
- document content
- web pages
- plagiarism detection
- time stamped
- expert finding
- textual content
- semi structured
- multimedia documents
- document set
- natural language
- vector space model
- relational databases
- vector space
- web data
- database