Abhay Sheshadri
Publication Activity (10 Years)
Years Active: 2024-2024
Publications (10 Years): 3
Top Topics
Single Step
Computer Software
Behavior Recognition
Computational Efficiency
Top Venues
CoRR
ACL (Findings)
Publications
Abhay Sheshadri, Aidan Ewart, Phillip Guo, Aengus Lynch, Cindy Wu, Vivek Hebbar, Henry Sleight, Asa Cooper Stickland, Ethan Perez, Dylan Hadfield-Menell, Stephen Casper
Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs.
CoRR (2024)
Jannik Brinkmann, Abhay Sheshadri, Victor Levoso, Paul Swoboda, Christian Bartelt
A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task.
ACL (Findings) (2024)
Jannik Brinkmann, Abhay Sheshadri, Victor Levoso, Paul Swoboda, Christian Bartelt
A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task.
CoRR (2024)