Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change.

Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu
Published in: ACL (2020)
Keyphrases
  • gradient direction
  • batch size
  • batch mode
  • poisson process
  • gradient magnitude
  • single item
  • batch processing
  • tensor field
  • decision making
  • similarity measure
  • visual features