Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning.
Yihang Yao, Zuxin Liu, Zhepeng Cen, Jiacheng Zhu, Wenhao Yu, Tingnan Zhang, Ding Zhao
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- optimal policy
- Markov decision process
- Markov decision processes
- state space
- optimization algorithm
- action selection
- state and action spaces
- policy search
- sequential quadratic programming
- optimization problems
- action space
- global optimization
- reinforcement learning problems
- transition model
- policy gradient
- state action
- function approximators
- objective function
- penalty function
- partially observable environments
- sufficient conditions
- average cost
- linear constraints
- model free
- constrained optimization
- function approximation
- actor critic
- optimization method
- mobile robot
- evolutionary algorithm
- learning process
- multi agent