Model-based reinforcement learning for infinite-horizon approximate optimal tracking.
Rushikesh Kamalapurkar, Lindsey Andrews, Patrick Walters, Warren E. Dixon
Published in: CDC (2014)
Keyphrases
- infinite horizon
- model based reinforcement learning
- Markov decision processes
- finite horizon
- optimal control
- dynamic programming
- stochastic demand
- average cost
- single item
- optimal policy
- long run
- fixed cost
- inventory policy
- state space
- single product
- finite state
- Markov decision process
- reinforcement learning
- partially observable
- decision problems
- holding cost
- lost sales
- inventory control
- Markov decision problems
- expected cost
- multistage
- optimal solution
- policy iteration
- inventory level
- ordering cost