Revisiting Attention Weights as Explanations from an Information Theoretic Perspective.
Bingyang Wen, K. P. Subbalakshmi, Fan Yang. Published in: CoRR (2022)
Keyphrases
- information theoretic
- information theory
- mutual information
- theoretic framework
- information bottleneck
- information theoretic measures
- relative entropy
- multi modality
- entropy measure
- minimum description length
- jensen shannon divergence
- kullback leibler divergence
- log likelihood
- computational learning theory
- linear combination
- kl divergence
- bregman divergences
- distance measure
- pattern recognition