A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
Andrew Lee
Xiaoyan Bai
Itamar Pres
Martin Wattenberg
Jonathan K. Kummerfeld
Rada Mihalcea
Published in:
CoRR (2024)
Keyphrases
optimization problems
orders of magnitude
machine learning
learning algorithm
theoretical analysis
neural network
real world
data structure
machine learning algorithms
data mining algorithms
data mining
significant improvement
worst case
recently developed
test bed