Trust-Region-Free Policy Optimization for Stochastic Policies.
Mingfei Sun, Benjamin Ellis, Anuj Mahajan, Sam Devlin, Katja Hofmann, Shimon Whiteson
Published in: CoRR (2023)
Keyphrases
- step size
- line search
- trust region
- control policies
- optimal policy
- global optimum
- convergence rate
- unconstrained optimization
- Hessian matrix
- cost function
- global convergence
- convergence speed
- optimization methods
- echelon stock
- state space
- combinatorial optimization
- linear program
- finite number
- optimization procedure
- quadratic programming
- optimization algorithm
- optimization problems
- NP-hard
- Newton method
- search algorithm
- support vector
- reinforcement learning