Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs.
Tianyuan Jin
Hao-Lun Hsu
William Chang
Pan Xu
Published in:
AAAI (2024)
Keyphrases
multi agent
multi armed bandit
regret bounds
reinforcement learning
high dimensional
random sampling
sampling algorithm
sparse representation
linear regression
regularized least squares
lower bound
markov chain monte carlo
linear predictors