Approximately optimal policies for a class of Markov decision problems with applications to energy harvesting.
Dor Shaviv, Ayfer Özgür
Published in: WiOpt (2017)
Keyphrases
- optimal policy
- Markov decision problems
- Markov decision processes
- state space
- decision problems
- reinforcement learning
- dynamic programming
- finite horizon
- infinite horizon
- sufficient conditions
- long run
- finite state
- average reward
- dynamic programming algorithms
- policy iteration
- multistage
- partially observable
- average cost
- Markov decision process
- linear programming
- reward function
- steady state
- initial state