Understanding the Failure of Batch Normalization for Transformers in NLP.
Jiaxi Wang
Ji Wu
Lei Huang
Published in:
NeurIPS (2022)
Keyphrases
natural language processing
natural language
neural network
knowledge representation
information extraction
text mining
root cause
normalization method
data sets
real world
artificial intelligence
wordnet
language processing
text processing