BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset.
Jiaming Ji
Mickel Liu
Juntao Dai
Xuehai Pan
Chi Zhang
Ce Bian
Boyuan Zhang
Ruiyang Sun
Yizhou Wang
Yaodong Yang
Published in:
CoRR (2023)
Keyphrases
improved algorithm
database
benchmark datasets
personality traits
neural network
data mining
human subjects
multi criteria
human interaction