Trust region policy optimization via entropy regularization for Kullback-Leibler divergence constraint.
Haotian Xu, Junyu Xuan, Guangquan Zhang, Jie Lu
Published in: Neurocomputing (2024)
Keyphrases
- kullback leibler divergence
- trust region
- mutual information
- optimization methods
- information theoretic
- information theory
- kl divergence
- line search
- probability density function
- distance measure
- risk minimization
- global optimum
- optimization problems
- log likelihood
- optimization method
- global convergence
- feature selection
- newton method
- marginal distributions
- column generation