Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm.
Qinbo Bai
Amrit Singh Bedi
Vaneet Aggarwal
Published in:
AAAI (2023)
Keyphrases
policy gradient
reinforcement learning
actor critic
reinforcement learning algorithms
function approximation
policy search
policy gradient methods
gradient method
optimal control
model free reinforcement learning
reinforcement learning methods
temporal difference
machine learning
average reward
function approximators
partially observable markov decision processes
learning algorithm
state action
optimal policy
state space
multi agent
single agent
control problems
approximation methods
model free
markov decision processes
variance reduction
model checking
neural network