Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning.
Yihang Yao, Zuxin Liu, Zhepeng Cen, Jiacheng Zhu, Wenhao Yu, Tingnan Zhang, Ding Zhao
Published in: NeurIPS (2023)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- action selection
- optimization algorithm
- markov decision processes
- partially observable
- learning algorithm
- constrained optimization
- state and action spaces
- markov decision process
- function approximation
- global optimization
- optimization problems
- state space
- partially observable environments
- multi agent
- reinforcement learning algorithms
- temporal difference
- reward function
- model free
- decision problems
- policy gradient
- optimization process
- reinforcement learning problems
- optimization method
- cost function
- penalty function
- reinforcement learning methods
- objective function