Stochastic Multi-Armed Bandits with Strongly Reward-Dependent Delays.
Yifu Tang
Yingfei Wang
Zeyu Zheng
Published in:
AISTATS (2024)
Keyphrases
multi armed bandits
multi armed bandit
bandit problems
reinforcement learning
decision problems
timed petri nets
dynamic programming
bayesian networks
special case
monte carlo