A Dominant Strategy Truthful, Deterministic Multi-Armed Bandit Mechanism with Logarithmic Regret.
Divya Padmanabhan
Satyanath Bhat
Prabuchandran K. J.
Shirish K. Shevade
Y. Narahari
Published in:
CoRR (2017)
Keyphrases
regret bounds
multi-armed bandit
multi-armed bandits
online learning
lower bound
linear regression
mechanism design
reinforcement learning
decentralized decision making
upper bound
worst case
probabilistic model
least squares
maximum likelihood
bandit problems
strategy proof