Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management.

Shipra Agrawal, Randy Jia
Published in: EC (2019)
Keyphrases
  • inventory management
  • reinforcement learning
  • cost function
  • supply chain
  • online learning
  • learning algorithm
  • active learning
  • linear predictors
  • decision trees
  • special case
  • state space