Thompson Sampling Achieves $\tilde{O}(\sqrt{T})$ Regret in Linear Quadratic Control.

Published in: COLT (2022)

Keyphrases