A note on 'monotone optimal policies for Markov decision processes'.
Dieter Kalin
Published in: Math. Program. (1978)
Keyphrases
- markov decision processes
- optimal policy
- state space
- decision problems
- finite state
- reinforcement learning
- finite horizon
- dynamic programming
- policy iteration
- long run
- average cost
- average reward
- state dependent
- reinforcement learning algorithms
- multistage
- infinite horizon
- markov decision problems
- partially observable
- markov decision process
- state and action spaces
- sufficient conditions
- decision processes
- action space
- partially observable markov decision processes
- initial state
- control policies
- discount factor
- discounted reward
- policy evaluation
- markov chain
- inventory level
- monte carlo
- heuristic search
- reward function