Measuring documents similarity in large corpus using MapReduce algorithm.
Marouane Birjali, Abderrahim Beni Hssane, Mohammed Erritali
Published in: ICMCS (2016)
Keyphrases
- learning algorithm
- similarity measure
- preprocessing
- optimal solution
- computational complexity
- distance metric
- detection algorithm
- probabilistic model
- expectation maximization
- simulated annealing
- information retrieval
- dynamic programming
- objective function
- worst case
- segmentation algorithm
- clustering method
- similarity function