Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models.

William L. Cooper, Bharath Rangarajan

Published in: Oper. Res. (2012)

Keyphrases

inventory models
Markov decision processes
finite horizon
average cost
optimal policy
Markov decision process
inventory control
finite state
state space
single period
initial state
reinforcement learning
infinite horizon
dynamic programming
policy iteration
lost sales
control policies
partially observable
average reward
action space
machine learning
decision problems
multi agent
reward function
state dependent
long run
single item
cost function
fixed cost
service level