Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management.
Shipra Agrawal
Randy Jia
Published in:
CoRR (2019)
Keyphrases
inventory management
cost function
reinforcement learning
learning algorithm
online learning
Markov decision processes
convex optimization
linear programming
maximum likelihood
linear predictors