Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns.
Brian DuSell
David Chiang
Published in:
ICLR (2024)
Keyphrases
theoretical framework
mathematical model
parameter estimation
computational model
experimental data
objective function
probabilistic model
theoretical analysis
statistical model
hierarchical model
neural network
website
pattern mining
neural network model
prediction model
structural model