Regret Minimisation in Multi-Armed Bandits Using Bounded Arm Memory.
Arghya Roy Chaudhuri
Shivaram Kalyanakrishnan
Published in:
CoRR (2019)
Keyphrases
multi armed bandits
bandit problems
multi armed bandit problems
decision problems
multi armed bandit
learning algorithm
reinforcement learning
objective function
multi objective