Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations.
Swapnaja Achintalwar, Adriana Alvarado Garcia, Ateret Anaby-Tavor, Ioana Baldini, Sara E. Berger, Bishwaranjan Bhattacharjee, Djallel Bouneffouf, Subhajit Chaudhury, Pin-Yu Chen, Lamogha Chiazor, Elizabeth M. Daly, Rogério Abreu de Paula, Pierre L. Dognin, Eitan Farchi, Soumya Ghosh, Michael Hind, Raya Horesh, George Kour, Ja Young Lee, Erik Miehling, Keerthiram Murugesan, Manish Nagireddy, Inkit Padhi, David Piorkowski, Ambrish Rawat, Orna Raz, Prasanna Sattigeri, Hendrik Strobelt, Sarathkrishna Swaminathan, Christoph Tillmann, Aashka Trivedi, Kush R. Varshney, Dennis Wei, Shalisha Witherspoon, Marcel Zalmanovici
Published in: CoRR (2024)