Kai Williams
Publication Activity (10 Years)
Years Active: 2024-2024
Publications (10 Years): 2
Top Topics
Security Flaws
Fine Tuned
Watermarking Algorithm
Viable Alternative
Top Venues
CoRR
Publications
Domenic Rosati, Jan Wehner, Kai Williams, Lukasz Bartoszcze, David Atanasov, Robie Gonzales, Subhabrata Majumdar, Carsten Maple, Hassan Sajjad, Frank Rudzicz
Representation noising effectively prevents harmful fine-tuning on LLMs.
CoRR (2024)
Domenic Rosati, Jan Wehner, Kai Williams, Lukasz Bartoszcze, Jan Batzner, Hassan Sajjad, Frank Rudzicz
Immunization against harmful fine-tuning attacks.
CoRR (2024)