Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences.
Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White
Published in: J. Mach. Learn. Res. (2022)
Keyphrases
- Kullback-Leibler
- optimization algorithm
- optimization problems
- optimization process
- optimization method
- bidirectional
- optimization model
- evolutionary algorithm
- KL divergence
- combinatorial optimization
- direct search
- policy making
- action selection
- constrained optimization
- data sets
- statistical models
- distance measure
- probability distribution
- neural network