Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety.
Haitong Ma, Yang Guan, Shengbo Eben Li, Xiangteng Zhang, Sifa Zheng, Jianyu Chen
Published in: CoRR (2021)
Keyphrases
- actor critic
- reinforcement learning
- temporal difference
- reinforcement learning algorithms
- policy gradient
- optimal control
- neuro fuzzy
- approximate dynamic programming
- function approximation
- policy iteration
- model free
- gradient method
- Markov decision processes
- control problems
- average reward
- dynamic programming
- neural network
- partially observable
- machine learning
- optimal policy
- learning agent
- state space
- action selection
- learning problems
- fixed point
- fuzzy rules
- transfer learning
- control system
- learning algorithm
- policy gradient methods