Blackwell optimal policies in a Markov decision process with a Borel state space.
A. A. Yushkevich
Published in: Math. Methods Oper. Res. (1994)
Keyphrases
- optimal policy
- state space
- Markov decision processes
- stationary policies
- reinforcement learning
- finite horizon
- decision problems
- dynamic programming
- finite state
- heuristic search
- infinite horizon
- reward function
- reinforcement learning algorithms
- Markov decision process
- state variables
- long run
- state dependent
- Markov chain
- action space
- partially observable
- average reward
- search space
- policy iteration
- initial state
- multistage
- planning problems
- particle filter
- serial inventory systems
- average cost
- sufficient conditions
- lost sales
- control policies
- Markov decision problems
- fixed point
- inventory level
- discounted reward
- average reward reinforcement learning