Minimax Optimal Bandits for Heavy Tail Rewards.

Kyungjae Lee, Sungbin Lim

Published in: IEEE Trans. Neural Networks Learn. Syst. (2024)

Keyphrases

worst case
multi-armed bandits
reinforcement learning
control policy
neural network
data mining
special case
evaluation function
bandit problems