Multiconstrained Finite-Horizon Piecewise Deterministic Markov Decision Processes with Unbounded Transition Rates.
Yonghui Huang, Xianping Guo
Published in: Math. Oper. Res. (2020)
Keyphrases
- finite horizon
- markov decision processes
- optimal policy
- optimal stopping
- infinite horizon
- state space
- finite state
- average cost
- reinforcement learning
- dynamic programming
- stationary policies
- average reward
- transition matrices
- partially observable
- markov decision process
- control policies
- policy iteration
- decision theoretic planning
- reward function
- lost sales
- least squares
- action space
- bayesian networks