Computing Stabilizing Linear Controllers via Policy Iteration.
Andrew G. Lamperski
Published in: CDC (2020)
Keyphrases
- average reward
- policy iteration
- reinforcement learning
- Markov decision processes
- linear approximation
- optimal policy
- model free
- sample path
- fixed point
- finite state
- least squares
- temporal difference
- policy evaluation
- optimal control
- infinite horizon
- state space
- Markov decision process
- function approximation
- convergence rate
- control strategy
- sufficient conditions