Paradocs: un système d'identification automatique de documents parallèles (Paradocs: a system for the automatic identification of parallel documents).
Alexandre Patry, Philippe Langlais
Published in: TALN (Articles longs) (2005)
Keyphrases
- information retrieval
- document collections
- information retrieval systems
- document classification
- document clustering
- relevant documents
- document retrieval
- xml documents
- retrieval systems
- database
- semantic information
- text documents
- legal documents
- multi document summarization
- web documents
- digital documents
- metadata
- document content
- plagiarism detection
- vector space
- retrieved documents
- document set
- text analysis
- vector space model
- free text
- query terms
- keywords
- neural network