On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion.
Shun-Pin Hsu, Dong-Ming Chuang, Aristotle Arapostathis
Published in: Syst. Control. Lett. (2006)
Keyphrases
- optimal policy
- partially observed
- stationary policies
- Markov decision processes
- long run average cost
- Markov decision process
- state space
- average cost
- decision problems
- finite horizon
- dynamic programming
- infinite horizon
- finite state
- reinforcement learning
- Markov decision problems
- average reward
- inventory level
- state dependent
- multistage
- policy iteration
- long run
- reward function
- initial state
- sufficient conditions
- asymptotically optimal
- non-stationary
- semi-Markov decision processes
- dynamic programming algorithms
- discounted reward
- stationary distribution
- expected reward
- production planning