Multi-Armed Bandits with Generalized Temporally-Partitioned Rewards.
Ronald C. van den Broek
Rik Litjens
Tobias Sagis
Luc Siecker
Nina Verbeeke
Pratik Gajane
Published in:
CoRR (2023)
Keyphrases
multi armed bandits
bandit problems
multi armed bandit
temporal information
reinforcement learning
decision problems
learning algorithm
lower bound