Aengus Lynch
Publication Activity (10 Years)
Years Active: 2022-2024
Publications (10 Years): 7
Top Topics
Open Problems
Binary Vectors
Machine Learning
Semi Automated
Top Venues
CoRR
NeurIPS
Publications
Abhay Sheshadri, Aidan Ewart, Phillip Guo, Aengus Lynch, Cindy Wu, Vivek Hebbar, Henry Sleight, Asa Cooper Stickland, Ethan Perez, Dylan Hadfield-Menell, Stephen Casper
Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs.
CoRR (2024)
Aengus Lynch, Phillip Guo, Aidan Ewart, Stephen Casper, Dylan Hadfield-Menell
Eight Methods to Evaluate Robust Unlearning in LLMs.
CoRR (2024)
Daniel Tan, David Chanin, Aengus Lynch, Dimitrios Kanoulas, Brooks Paige, Adrià Garriga-Alonso, Robert Kirk
Analyzing the Generalization and Reliability of Steering Vectors.
CoRR (2024)
Aengus Lynch, Gbètondji J.-S. Dovonon, Jean Kaddour, Ricardo Silva
Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases.
CoRR (2023)
Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso
Towards Automated Circuit Discovery for Mechanistic Interpretability.
CoRR (2023)
Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso
Towards Automated Circuit Discovery for Mechanistic Interpretability.
NeurIPS (2023)
Jean Kaddour, Aengus Lynch, Qi Liu, Matt J. Kusner, Ricardo Silva
Causal Machine Learning: A Survey and Open Problems.
CoRR (2022)