Constraint-Generation Policy Optimization (CGPO): Nonlinear Programming for Policy Optimization in Mixed Discrete-Continuous MDPs.
Michael Gimelfarb, Ayal Taitler, Scott Sanner
Published in: CoRR (2024)
Keyphrases
- nonlinear programming
- linear constraints
- optimal policy
- optimization problems
- linear programming
- inequality constraints
- reinforcement learning
- Markov decision processes
- constrained optimization
- equality constraints
- Markov decision problems
- optimality conditions
- reward function
- equality and inequality constraints
- continuous state spaces
- sensitivity analysis
- least squares
- dynamic programming