On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits.
James Edwards
Paul Fearnhead
Kevin D. Glazebrook
Published in:
CoRR (2016)
Keyphrases
multi-armed bandits
knowledge base
least squares
optimal policy
closed form