Ilan Moscovitz
Publication Activity (10 Years)
Years Active: 2023-2024
Publications (10 Years): 2
Top Topics
False Positives
Rule Extraction
Detection Rate
Black Boxes
Top Venues
CoRR
ICLR
Publications
Lorenzo Pacchiardi, Alex James Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Markus Brauner
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions.
ICLR (2024)
Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions.
CoRR (2023)