Thompson Sampling in the Adaptive Linear Scalarized Multi Objective Multi Armed Bandit.
Saba Q. Yahyaa
Madalina M. Drugan
Bernard Manderick
Published in:
ICAART (2) (2015)
Keyphrases
active learning
multi objective
multi armed bandit
evolutionary algorithm
multi objective optimization
multi armed bandits
reinforcement learning
genetic algorithm
regret bounds
objective function
probabilistic model
closed form
decentralized decision making