Perplexed by Quality: A Perplexity-based Method for Adult and Harmful Content Detection in Multilingual Heterogeneous Web Data.
Tim Jansen
Yangling Tong
Victoria Zevallos
Pedro Ortiz Suarez
Published in:
CoRR (2022)
Keyphrases
web data
detection method
information retrieval
database
databases
machine learning
digital libraries
relational databases
language model
web mining
data sets
metadata
multimedia
semi structured