Policy Gradient Based Semi-Markov Decision Problems: Approximation and Estimation Errors.
Ngo Anh Vien, SeungGwan Lee, TaeChoong Chung
Published in: IEICE Trans. Inf. Syst. (2010)
Keyphrases
- markov decision problems
- estimation error
- queueing networks
- optimal policy
- linear programming
- state space
- reinforcement learning
- approximation methods
- partially observable
- decision processes
- decision theoretic
- expected utility
- markov decision processes
- utility function
- error rate
- standard deviation
- covariance matrix
- transition probabilities
- average cost
- reward function
- dynamic programming
- long run
- function approximators
- linear program
- action space
- steady state
- infinite horizon
- model free
- policy iteration
- decision making
- np hard
- supervised learning
- dynamical systems
- decision problems
- function approximation
- learning algorithm
- fixed point