How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions.
Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner
Published in: CoRR (2023)
Keyphrases
- black box
- black boxes
- white box
- automatic detection
- object detection
- expert systems
- artificial intelligence
- integration testing
- case based reasoning
- detection method
- machine learning
- false alarms
- detection algorithm
- test cases
- hybrid systems
- rule extraction
- white box testing
- detection rate
- false positives
- intelligent systems
- open source
- computational intelligence
- answer questions
- knowledge representation
- decision trees
- neural network
- databases
- data sets