Solution of MDPs Using Simulation-Based Value Iteration.
Mohammed Shahid Abdulla, Shalabh Bhatnagar
Published in: AIAI (2005)
Keyphrases
- markov decision processes
- state space
- dynamic programming
- reinforcement learning
- policy iteration
- optimal policy
- markov decision process
- finite state
- heuristic search
- linear equations
- stochastic shortest path
- genetic algorithm
- integer programming
- closed form
- mathematical model
- optimal solution
- factored mdps
- learning algorithm