Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces.
M. Teresa Robles-Alcaráz, Oscar Vega-Amaya, J. Adolfo Minjárez-Sosa
Published in: Risk Decis. Anal. (2017)
Keyphrases
- decision models
- Markov decision processes
- average cost
- decision model
- policy iteration
- influence diagrams
- decision theoretic
- policy evaluation
- finite state
- Markov chain
- model construction
- dynamic programming
- expected utility
- infinite horizon
- optimal policy
- learning algorithm
- decision problems
- cooperative
- probabilistic reasoning
- reinforcement learning
- state space
- Markov decision process
- partially observable
- decision making
- multi agent
- search algorithm
- average reward
- machine learning
- data mining