Bandits with concave rewards and convex knapsacks.

Shipra Agrawal, Nikhil R. Devanur

Published in: CoRR (2014)

Keyphrases

piecewise linear
multi-armed bandits
convexity properties
convex functions
knapsack problem
convex concave
long term and short term
bandit problems
reinforcement learning
objective function
Markov decision processes
multiarmed bandit
quadratic function
dynamic programming
stochastic systems
optimization problems
neural network
special case
machine learning
credit assignment
convex programming
real time
cost function
image segmentation
convex relaxation
computational complexity
convex optimization
linear program
online learning