Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards.
Aadirupa Saha
Pierre Gaillard
Michal Valko
Published in:
ICML (2020)
Keyphrases
action sets
markov decision processes
reinforcement learning
multi armed bandits
stochastic systems
finite state
state space
bandit problems
multi agent
monte carlo
multi armed bandit
decision making
dynamic programming
markov chain
optimal policy
model checking