Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs.

Tianyuan Jin, Hao-Lun Hsu, William Chang, Pan Xu

Published in: CoRR (2023)

Keyphrases

multi-agent
multi-armed bandit
regret bounds
reinforcement learning
high dimensional
lower bound
online learning
random sampling
sampling algorithm
feature selection
sparse representation
non-negative matrix factorization