Trust Region Policy Optimization with Optimal Transport Discrepancies: Duality and Algorithm for Continuous Actions.
Antonio Terpin, Nicolas Lanzetti, Batuhan Yardim, Florian Dörfler, Giorgia Ramponi
Published in: CoRR (2022)
Keyphrases
- trust region
- dynamic programming
- optimization algorithm
- learning algorithm
- optimal solution
- cost function
- worst case
- linear programming
- optimization method
- global optimum
- objective function
- optimization methods
- faster convergence
- image sequences
- multi objective
- np hard
- support vector
- levenberg marquardt
- line search
- evolutionary algorithm
- lower bound
- particle swarm