Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention.

Biao Zhang, Ivan Titov, Rico Sennrich
Published in: EMNLP/IJCNLP (1) (2019)