Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws.

Kush Bhatia, Wenshuo Guo, Jacob Steinhardt
Published in: CoRR (2023)