ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts.
Amelia F. Hardy
Houjun Liu
Bernard Lange
Mykel J. Kochenderfer
Published in:
CoRR (2024)
Keyphrases
language model
weakly supervised
probabilistic model
n-gram
topic models
object class
information retrieval
superpixels
mixture model
relation extraction
semi-supervised
object detection
bayesian networks
named entities
visual features
graph cuts
co-occurrence
information extraction
pairwise