Causal Interpretation of Self-Attention in Pre-Trained Transformers.

Raanan Y. Rohekar, Yaniv Gurwicz, Shami Nisimov
Published in: CoRR (2023)
Keyphrases
  • pre-trained
  • training data
  • training examples
  • control signals
  • focus of attention
  • data sets
  • neural network
  • bayesian networks
  • reinforcement learning
  • pairwise
  • active learning
  • generative model
  • face detection