Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization.

Yihan Du, Anna Winnicki, Gal Dalal, Shie Mannor, R. Srikant
Published in: CoRR (2024)