Semantic Sequence Kin: A Method of Document Copy Detection.
Jun-Peng Bao, Jun-Yi Shen, Xiao-Dong Liu, Haiyan Liu, Xiao-Di Zhang
Published in: PAKDD (2004)
Keyphrases
- synthetic data
- high level
- segmentation method
- clustering method
- pairwise
- significant improvement
- high accuracy
- data sets
- cost function
- detection method
- semantic web
- experimental evaluation
- preprocessing
- objective function
- computational cost
- concept hierarchy
- copy detection
- semantic information
- input data
- knowledge base
- feature selection