Transformer Dissection: An Unified Understanding for Transformer's Attention via the Lens of Kernel.
Yao-Hung Hubert Tsai
Shaojie Bai
Makoto Yamada
Louis-Philippe Morency
Ruslan Salakhutdinov
Published in:
EMNLP/IJCNLP (1) (2019)
Keyphrases
fuzzy logic
fault diagnosis
power transformers
high voltage
incipient fault
power system
partial discharge
focus of attention
unified model
distribution network
learning algorithm
support vector
feature vectors
gaussian processes
kernel density estimation
sparse kernel