ISAACS: Iterative Soft Adversarial Actor-Critic for Safety.
Kai-Chieh Hsu, Duy Phuong Nguyen, Jaime Fernández Fisac
Published in: L4DC (2023)
Keyphrases
- actor critic
- temporal difference
- reinforcement learning
- policy gradient
- optimal control
- importance sampling
- Monte Carlo
- neuro fuzzy
- approximate dynamic programming
- policy iteration
- gradient method
- average reward
- multi agent
- reinforcement learning algorithms
- dynamic programming
- sufficient conditions
- fixed point
- machine learning
- model free
- cost function
- decision making