Policy iteration for optimal switching with continuous-time dynamics.
Tohid Sardarmehni, Ali Heydari
Published in: IJCNN (2016)
Keyphrases
- policy iteration
- optimal control
- Markov decision processes
- dynamical systems
- average reward
- optimal solution
- optimal policy
- discounted reward
- sample path
- approximate dynamic programming
- model free
- reinforcement learning
- state space
- dynamic programming
- infinite horizon
- Markov chain
- average cost
- least squares
- factored MDPs
- policy iteration algorithm
- Markov decision process
- search algorithm
- neural network