BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset.

Published in: CoRR (2023)
