Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management.
Shipra Agrawal
Randy Jia
Published in:
Oper. Res. (2022)
Keyphrases
reinforcement learning
inventory management
learning algorithm
cost function
online learning
supply chain
Markov decision processes
machine learning
training data
dynamic programming
online convex optimization