Semi-infinite discounted Markov decision processes: Policy improvement and singular perturbations.
Mohammed Abbad, Khalid Rahhali
Published in: Math. Methods Oper. Res. (2001)
Keyphrases
- markov decision processes
- semi infinite
- optimal policy
- linear program
- policy iteration
- infinite horizon
- markov decision process
- average reward
- average cost
- finite horizon
- state space
- state and action spaces
- dynamic programming
- partially observable
- stationary policies
- reward function
- finite state
- discounted reward
- action space
- total reward
- reinforcement learning
- discount factor
- transition matrices
- semidefinite
- optimality conditions
- linear programming
- long run
- reinforcement learning algorithms
- markov decision problems
- expected reward
- partially observable markov decision processes
- sufficient conditions
- quadratic program
- multistage
- support vector machine
- search space
- computational complexity
- objective function