Dataset Characteristics and Their Impact on Offline Policy Learning of Contextual Multi-Armed Bandits.

Piotr Januszewski, Dominik Grzegorzek, Pawel Czarnul
Published in: ICAART (2) (2024)
Keyphrases
  • multi armed bandits
  • learning process
  • supervised learning
  • learning algorithm
  • reinforcement learning
  • active learning
  • online learning
  • optimal policy
  • learning tasks