Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws.
Kush Bhatia
Wenshuo Guo
Jacob Steinhardt
Published in: AISTATS (2023)
Keyphrases
optimal design
reinforcement learning
learning algorithm
learning process
learning systems
learning tasks
prior knowledge
active learning
learning community
supervised learning
knowledge acquisition
mobile learning
neural network
upper bound
online learning
multi-armed bandits