Off-Policy Primal-Dual Safe Reinforcement Learning.
Zifan Wu, Bo Tang, Qian Lin, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang
Published in: ICLR (2024)
Keyphrases
- primal dual
- reinforcement learning
- linear programming
- convex optimization
- linear program
- linear programming problems
- affine scaling
- approximation algorithms
- interior point methods
- convergence rate
- interior point algorithm
- variational inequalities
- simplex algorithm
- algorithm for linear programming
- semidefinite programming
- interior point
- duality gap
- convex programming
- infeasible interior point
- optimal policy
- learning algorithm
- simplex method
- dynamic programming
- markov decision processes
- model free
- machine learning
- convex functions
- objective function
- convex optimization problems
- special case
- sufficient conditions
- genetic algorithm
- state space
- valid inequalities
- learning problems
- total variation