Scalable Feature Extraction from Noisy Documents.
Loïc Lecerf, Boris Chidlovskii
Published in: ICDAR (2009)
Keyphrases
- feature extraction
- document collections
- legal documents
- information retrieval systems
- XML documents
- information retrieval
- text documents
- web documents
- document clustering
- preprocessing
- keywords
- document retrieval
- wavelet transform
- free text
- document analysis
- digital documents
- metadata
- document representation
- vector space model
- noisy data
- discriminant analysis
- relevant documents
- frequency domain
- image processing
- image classification
- dimensionality reduction
- face recognition
- multi-document summarization
- feature vectors
- document classification
- feature selection
- noisy environments
- multimedia documents
- structured documents
- iris recognition
- principal component analysis
- user queries
- text mining
- ranked list
- co-occurrence
- linear discriminant analysis
- vector space
- texture features
- semantic information