Trust Region Policy Optimization with Optimal Transport Discrepancies: Duality and Algorithm for Continuous Actions.
Antonio Terpin, Nicolas Lanzetti, Batuhan Yardim, Florian Dörfler, Giorgia Ramponi
Published in: NeurIPS (2022)
Keyphrases
- optimization algorithm
- trust region
- dynamic programming
- optimal solution
- cost function
- worst case
- learning algorithm
- linear programming
- np hard
- optimization method
- image sequences
- hessian matrix
- levenberg marquardt
- objective function
- closed form
- particle swarm optimization
- optimization methods
- global optimum
- optimization procedure
- log likelihood
- maximum likelihood
- artificial neural networks