Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems.
Muthukumar Palanisamy, Hamidreza Modares, Frank L. Lewis, Muhammad Aurangzeb
Published in: IEEE Trans. Cybern. (2015)
Keyphrases
- infinite horizon
- optimal control
- linear quadratic
- optimal policy
- average cost
- fixed cost
- dynamic programming
- reinforcement learning
- finite horizon
- state space
- holding cost
- dec pomdps
- policy iteration
- production planning
- markov decision processes
- markov decision process
- decision problems
- long run
- partially observable
- single item
- stochastic demand
- dynamical systems
- production cost
- periodic review
- inventory policy
- learning algorithm
- lost sales
- partially observable markov decision processes
- expected cost
- control strategy
- multistage
- lead time
- closed loop