Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change.
Hongfei Xu
Josef van Genabith
Deyi Xiong
Qiuhui Liu
Published in:
ACL (2020)
Keyphrases
gradient direction
batch size
batch mode
poisson process
gradient magnitude
single item
batch processing
tensor field
decision making
similarity measure
visual features