Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback.
Guojun Xiong
Jian Li
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
multi-armed bandits
optimal control
bandit problems
multi-agent
machine learning
multi-armed bandit
state space
Markov chain
function approximation
dynamic programming
least squares
online learning
maximum likelihood
reinforcement learning algorithms