Parallel Attention and Feed-Forward Net Design for Pre-training and Inference on Transformers.

Shashank Sonkar, Richard G. Baraniuk
Published in: CoRR (2023)
Keyphrases