Optimal Policies for Payment of Dividends through a Fixed Barrier at Discrete Time.
Raúl Montes-de-Oca, Patricia Saavedra, Gabriel Zacarías-Espinoza, Daniel Cruz-Suárez
Published in: ICORES (2017)
Keyphrases
- optimal policy
- finite state
- semi-Markov decision processes
- Markov decision processes
- state space
- decision problems
- finite horizon
- dynamic programming
- average reward
- reinforcement learning
- Markov chain
- multistage
- sufficient conditions
- infinite horizon
- policy iteration
- state dependent
- long run
- average reward reinforcement learning
- Markov decision process
- initial state
- control policies
- Markov decision problems
- stationary policies
- dynamic programming algorithms
- serial inventory systems
- average cost
- partially observable Markov decision processes
- ordering cost
- steady state
- data mining