Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management.

Shipra Agrawal, Randy Jia

Published in: EC (2019)

Keyphrases

inventory management
reinforcement learning
cost function
supply chain
online learning
learning algorithm
active learning
linear predictors
decision trees
special case
state space