Keyphrases
- risk sensitive
- Markov decision processes
- finite horizon
- average cost
- state space
- optimal control
- optimal policy
- infinite horizon
- finite state
- dynamic programming
- control policies
- Markov decision process
- reinforcement learning
- policy iteration
- lost sales
- Markov chain
- reinforcement learning algorithms
- partially observable
- action space
- Markov decision problems
- long run
- average reward
- dynamical systems
- initial state
- state transitions
- linear programming
- learning algorithm
- finite number
- total cost
- real valued