Monotonic and Insensitive Optimal Policies for Control of Queues with Undiscounted Costs.
Shaler Stidham Jr., Richard R. Weber
Published in: Oper. Res. (1989)
Keyphrases
- optimal policy
- markov decision processes
- infinite horizon
- average reward
- average cost
- policy iteration
- control policies
- long run
- reinforcement learning
- finite horizon
- decision problems
- markov decision problems
- optimal control
- state space
- finite state
- fixed cost
- dynamic programming
- multistage
- single item
- stochastic demand
- dynamic programming algorithms
- expected cost
- markov decision process
- control system
- periodic review
- sufficient conditions
- total reward
- serial inventory systems
- control policy
- state dependent
- initial state
- partially observable
- average reward reinforcement learning
- optimal solution
- holding cost
- action space
- queueing networks
- sample path
- least squares
- semi markov decision processes