Number of Attention Heads vs Number of Transformer-Encoders in Computer Vision.

Tomas Hrycej, Bernhard Bermeitinger, Siegfried Handschuh
Published in: CoRR (2022)