On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes.
Emmanuel Fernández-Gaucherand, Aristotle Arapostathis, Steven I. Marcus
Published in: Ann. Oper. Res. (1991)
Keyphrases
- average cost
- optimal policy
- partially observable markov decision processes
- finite state
- markov decision processes
- infinite horizon
- finite horizon
- long run
- decision problems
- bayesian reinforcement learning
- reinforcement learning
- state space
- average reward
- dynamic programming
- initial state
- state dependent
- multistage
- policy iteration
- sufficient conditions
- inventory models
- optimal control
- finite number
- markov decision process
- holding cost
- control policies
- setup cost
- markov decision problems
- model checking
- linear programming
- linear program
- bayesian networks
- asymptotically optimal
- partially observable
- control policy
- supply chain
- lost sales
- total cost
- heuristic search
- data mining