Reinforcement Learning Fine-tuning of Language Models is Biased Towards More Extractable Features.

Diogo Cruz, Edoardo Pona, Alex Holness-Tofts, Elias Schmied, Víctor Abia Alonso, Charlie Griffin, Bogdan-Ionut Cirstea
Published in: CoRR (2023)
Keyphrases