Learning Neural Contextual Bandits through Perturbed Rewards.

Yiling Jia, Weitong Zhang, Dongruo Zhou, Quanquan Gu, Hongning Wang

Published in: ICLR (2022)

Keyphrases

learning systems
reinforcement learning
learning process
knowledge acquisition
data sets
neural network
multi-armed bandits
incremental learning
learning tasks
supervised learning
prior knowledge
case study
contextual information
mobile learning
decision trees
learning algorithm
learning rules
motor control
machine learning