Persistently Optimal Policies in Stochastic Dynamic Programming with Generalized Discounting.
Anna Jaskiewicz, Janusz Matkowski, Andrzej S. Nowak
Published in: Math. Oper. Res. (2013)
Keyphrases
- optimal policy
- stochastic dynamic programming
- approximate dynamic programming
- decision problems
- influence diagrams
- continuous state
- dynamic programming
- reinforcement learning
- state dependent
- policy iteration
- Markov decision processes
- average cost
- state space
- finite state
- finite horizon
- partially observable Markov decision processes
- control policies
- long run
- multistage
- average reward
- infinite horizon
- initial state
- Markov decision problems
- experimental design
- Markov decision process
- reward function
- model free
- least squares
- linear program