Optimistic reinforcement learning by forward Kullback-Leibler divergence optimization.
Taisuke Kobayashi
Published in:
Neural Networks (2022)
Keyphrases
kullback leibler divergence
reinforcement learning
mutual information
information theoretic
kl divergence
probability density function
information theory
distance measure
supervised learning
non stationary
marginal distributions