Semantically-grounded construction of centroids for datasets with textual attributes.
Sergio Martínez, Aïda Valls, David Sánchez
Published in: Knowl. Based Syst. (2012)
Keyphrases
- natural language
- construction process
- data sets
- metadata
- multimedia
- benchmark datasets
- attribute values
- categorical attributes
- k-means
- training dataset
- class imbalanced data
- UCI machine learning repository
- textual descriptions
- connected components
- semantic information
- data points
- keywords
- clustering algorithm
- search engine