Markov decision processes with state-dependent discount factors and unbounded rewards/costs.
Qingda Wei, Xianping Guo
Published in: Oper. Res. Lett. (2011)
Keyphrases
- markov decision processes
- state dependent
- optimal policy
- average cost
- state space
- reinforcement learning
- decision problems
- finite state
- dynamic programming
- long run
- infinite horizon
- average reward
- steady state
- finite horizon
- markov decision process
- policy iteration
- reward function
- transition matrices
- continuous state
- expected cost
- action space
- queueing networks
- decision theoretic planning
- multistage
- sufficient conditions
- markov decision problems
- initial state
- partially observable
- single server
- discounted reward
- inventory level
- expected reward
- real time dynamic programming
- asymptotically optimal
- total cost
- production cost
- queue length
- lead time
- special case
- total reward
- computational complexity
- decision making
- data mining