On the Convergence of Encoder-only Shallow Transformers.
Yongtao Wu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
Published in:
CoRR (2023)
Keyphrases
bit rate
information extraction
question answering
low complexity
rate distortion
natural language processing
convergence rate
data sets
information systems
knowledge base
multiscale
macroblock
faster convergence
syntactic parsing
deep knowledge