MapDupReducer: detecting near duplicates over massive datasets.

Chaokun Wang, Jianmin Wang, Xuemin Lin, Wei Wang, Haixun Wang, Hongsong Li, Wanpeng Tian, Jun Xu, Rui Li
Published in: SIGMOD Conference (2010)
Keyphrases
  • massive datasets
  • massive data
  • text data
  • big data
  • computationally challenging
  • databases
  • methods require
  • data sets
  • real world
  • data mining
  • multimedia
  • relational databases
  • spatial data
  • stored data