A Distributed Algorithm for Sequential Decision Making in Multi-Armed Bandit with Homogeneous Rewards.
Jingxuan Zhu
Romeil Sandhu
Ji Liu
Published in:
CDC (2020)
Keyphrases
learning algorithm
multi armed bandit
reinforcement learning
objective function
dynamic programming
cost function
sequential decision making
search space
expectation maximization
machine learning
optimal solution
support vector
worst case
convergence rate
multi armed bandits