Suraj Anand
Publication Activity (10 Years)
Years Active: 2019-2024
Publications (10 Years): 4
Top Topics
Learning Contexts
Term Dependencies
Weighting Scheme
Feedback Information
Top Venues
CoRR
EMOOCs-WIP
Publications
Suraj Anand, David Getzen
Are PPO-ed Language Models Hackable?
CoRR (2024)
Louis Castricato, Nathan Lile, Suraj Anand, Hailey Schoelkopf, Siddharth Verma, Stella Biderman
Suppressing Pink Elephants with Direct Principle Feedback.
CoRR (2024)
Suraj Anand, Michael A. Lepori, Jack Merullo, Ellie Pavlick
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting.
CoRR (2024)
Suraj Anand, Francesca Bonadei
Springer Nature and online courses with iversity.
EMOOCs-WIP (2019)