Randomized Allocation with Nonparametric Estimation for Contextual Multi-Armed Bandits with Delayed Rewards.
Sakshi Arya
Yuhong Yang
Published in:
CoRR (2019)
Keyphrases
multi-armed bandits
nonparametric estimation
bandit problems
probability density
multi-armed bandit
decision problems
probability density function
computational complexity
dynamic programming
probability distribution
upper bound
loss function
closed form