DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging.

Matteo Pagliardini, Amirkeivan Mohtashami, François Fleuret, Martin Jaggi
Published in: CoRR (2024)
Keyphrases