BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset.

Published in: NeurIPS (2023)

Keyphrases