On the Expressivity Role of LayerNorm in Transformers' Attention.

Shaked Brody, Uri Alon, Eran Yahav
Published in: CoRR (2023)
Keyphrases
  • special case
  • real time
  • neural network
  • computer vision
  • similarity measure
  • bayesian networks
  • multiscale
  • digital libraries
  • visual attention