Dataset Characteristics and Their Impact on Offline Policy Learning of Contextual Multi-Armed Bandits.
Piotr Januszewski
Dominik Grzegorzek
Pawel Czarnul
Published in:
ICAART (2) (2024)
Keyphrases
multi-armed bandits
learning process
supervised learning
learning algorithm
reinforcement learning
active learning
online learning
optimal policy
learning tasks