Keyphrases
- optimal policy
- dynamic programming
- Markov decision processes
- state space
- decision problems
- infinite horizon
- long run
- multistage
- state dependent
- finite horizon
- finite state
- dynamic programming algorithms
- average reward
- reinforcement learning
- Markov decision problems
- initial state
- policy iteration
- Bayesian reinforcement learning
- partially observable Markov decision processes
- Markov decision process
- average reward reinforcement learning
- average cost
- lost sales
- Lagrangian relaxation
- linear program
- machine learning