Understanding the Failure of Batch Normalization for Transformers in NLP.

Jiaxi Wang, Ji Wu, Lei Huang
Published in: CoRR (2022)
Keyphrases
  • natural language processing
  • databases
  • preprocessing
  • question answering
  • database
  • data sets
  • natural language
  • expert systems
  • real world
  • machine learning
  • text mining
  • co-occurrence
  • language processing