Computing Stabilizing Feedback Gains via a Model-Free Policy Gradient Method.
Ibrahim Kurban Özaslan, Hesameddin Mohammadi, Mihailo R. Jovanovic
Published in: IEEE Control. Syst. Lett. (2023)
Keyphrases
- model free
- gradient method
- policy iteration
- policy gradient
- reinforcement learning
- policy evaluation
- reinforcement learning algorithms
- function approximation
- convergence rate
- average reward
- temporal difference
- optimization methods
- optimal policy
- rl algorithms
- negative matrix factorization
- step size
- dynamic programming
- genetic algorithm
- cost function