Supporting Human Raters with the Detection of Harmful Content using Large Language Models.

Kurt Thomas, Patrick Gage Kelley, David Tao, Sarah Meiklejohn, Owen Vallis, Shunwen Tan, Blaz Bratanic, Felipe Tiengo Ferreira, Vijay Kumar Eranti, Elie Bursztein

Published in: CoRR (2024)