Comparing Clustering Techniques on Brazilian Legal Document Datasets.
João Pedro Lima, José Alfredo F. Costa
Published in: HAIS (2022)
Keyphrases
- document clustering
- clustering algorithm
- text clustering
- clustering approaches
- database
- clustering method
- high dimensional datasets
- synthetic and real datasets
- tolerance rough set
- synthetic datasets
- k means
- topic discovery
- data mining tasks
- hierarchical clustering
- document collections
- information retrieval
- cosine similarity
- cluster membership
- tf idf
- unsupervised learning
- categorical data
- document classification
- structured documents
- retrieval systems
- document images
- logical structure
- electronic documents
- document clusters
- data sets
- clustering analysis
- fuzzy clustering
- spectral clustering
- benchmark datasets
- legal knowledge
- data mining