A Modified Value Iteration Algorithm for Discounted Markov Decision Processes.
Sanaa Chafik, Cherki Daoui
Published in: J. Electron. Commer. Organ. (2015)
Keyphrases
- markov decision processes
- dynamic programming
- average reward
- policy iteration
- model based reinforcement learning
- optimal policy
- state space
- infinite horizon
- finite horizon
- np hard
- reachability analysis
- optimal solution
- learning algorithm
- finite state
- stochastic shortest path
- markov decision process
- computational complexity
- real time dynamic programming
- search space
- planning under uncertainty
- optimality criterion
- interval estimation
- semi markov decision processes
- discount factor
- linear programming
- discounted reward
- total reward
- expected reward
- state abstraction
- risk sensitive
- optimal control
- multistage
- monte carlo
- heuristic search
- linear program
- average cost
- partially observable
- reinforcement learning algorithms
- model free
- convergence rate
- long run