Natural Gradient Policy for Average Cost SMDP Problem.
Ngo Anh Vien, TaeChoong Chung
Published in: ICTAI (1) (2007)
Keyphrases
- average cost
- markov decision problems
- natural gradient
- optimal policy
- markov decision processes
- policy gradient
- average reward
- long run
- independent component analysis
- learning rate
- optimal control
- finite state
- finite number
- infinite horizon
- blind source separation
- policy iteration
- approximate dynamic programming
- total cost
- linear programming
- initial state
- decision problems
- control policy
- multistage
- markov decision process
- fixed point
- state space
- dynamic programming
- linear program
- reinforcement learning
- partially observable
- reinforcement learning algorithms
- function approximation
- search space
- decision processes
- partially observable markov decision processes
- markov chain
- sufficient conditions