Nearly Minimax Optimal Regret for Multinomial Logistic Bandit.

Joongkyu Lee, Min-hwan Oh

Published in: CoRR (2024)

Keyphrases

worst case
regret bounds
minimax regret
multi-armed bandit
bandit problems
optimal solution
text classification
online learning
logistic regression
dynamic programming
probabilistic model
loss function
expectation maximization
Monte Carlo
closed form
evaluation function
lower bound
decision trees