Bandits with concave rewards and convex knapsacks.
Shipra Agrawal, Nikhil R. Devanur
Published in: CoRR (2014)
Keyphrases
- piecewise linear
- multi-armed bandits
- convexity properties
- convex functions
- knapsack problem
- convex concave
- long-term and short-term
- bandit problems
- reinforcement learning
- objective function
- Markov decision processes
- multiarmed bandit
- quadratic function
- dynamic programming
- stochastic systems
- optimization problems
- neural network
- special case
- machine learning
- credit assignment
- convex programming
- real time
- cost function
- image segmentation
- convex relaxation
- computational complexity
- convex optimization
- linear program
- online learning