Learning Contextual Bandits Through Perturbed Rewards.

Yiling Jia, Weitong Zhang, Dongruo Zhou, Quanquan Gu, Hongning Wang

Published in: CoRR (2022)

Keyphrases

learning process
reinforcement learning
multi-armed bandits
learning tasks
learning systems
online learning
active learning
database
data sets
training data
prior knowledge
knowledge acquisition
context-aware
learning algorithm
genetic algorithm
context-sensitive
learning scheme
learning mechanism
elementary school