Constrained Variational Policy Optimization for Safe Reinforcement Learning.
Zuxin Liu, Zhepeng Cen, Vladislav Isenbaev, Wei Liu, Zhiwei Steven Wu, Bo Li, Ding Zhao
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- optimal policy
- concave convex procedure
- policy search
- optimization algorithm
- action selection
- policy gradient
- markov decision process
- global optimization
- machine learning
- optimization problems
- image segmentation
- optimization method
- partially observable environments
- markov decision problems
- partially observable
- markov decision processes
- policy evaluation
- constrained optimization
- actor critic
- policy iteration
- approximate dynamic programming
- function approximators
- lagrange multipliers
- action space
- dynamic programming
- reward function
- learning algorithm
- reinforcement learning problems
- state and action spaces
- neural network
- model free
- finite state
- optimization process
- function approximation
- linear programming
- decision problems