Scaling to Large³ Data: An Efficient and Effective Method to Compute Distributional Thesauri.
Martin Riedl, Chris Biemann
Published in: EMNLP (2013)
Keyphrases
- synthetic data
- data sets
- test data
- noisy data
- input data
- computationally efficient
- preprocessing
- high quality
- statistical methods
- data collection
- data processing
- data mining techniques
- high precision
- prior knowledge
- prior information
- missing values
- segmentation method
- detection method
- raw data
- spectral clustering
- data quality
- information loss
- image data
- significant improvement
- xml documents
- data analysis
- objective function
- training data
- similarity measure
- statistical analysis
- feature set
- relevance feedback
- data sources
- computational cost
- dynamic programming
- original data
- face recognition
- decision trees