Text Deduplication with Minimum Loss Ratio.
Youming Ge, Jiefeng Wu, Genan Dai, Yubao Liu
Published in: ICMLC (2019)
Keyphrases
- text retrieval
- free text
- text mining
- minimum cost
- keywords
- text processing
- web documents
- textual data
- case study
- digital libraries
- information extraction
- semantic information
- artificial intelligence
- standard deviation
- information retrieval
- database
- record linkage
- data cleaning
- text analysis
- news stories
- text data
- text documents
- pattern matching
- NP-hard
- data structure
- data mining