Separating the Wheat from the Chaff with BREAD: An open-source benchmark and metrics to detect redundancy in text.
Isaac Caswell, Lisa Wang, Isabel Papadimitriou
Published in: CoRR (2023)
Keyphrases
- open source
- open source software
- information retrieval
- keywords
- detection algorithm
- detection method
- text mining
- text retrieval
- free text
- source code
- automatically extracted
- comparative analysis
- core components
- natural language generation
- document analysis
- automatic detection
- text documents
- real world
- textual information
- text summarization
- face detection
- latent semantic analysis
- complex background
- text processing
- natural language text
- string matching
- image quality
- similarity metrics
- sentence level
- neural network