Symmetric Dot-Product Attention for Efficient Training of BERT Language Models.
Martin Courtois
Malte Ostendorff
Leonhard Hennig
Georg Rehm
Published in:
ACL (Findings) (2024)
Keyphrases
language model
language modeling
probabilistic model
dot product
document retrieval
n gram
statistical language models
language modelling
speech recognition
information retrieval
retrieval model
test collection
mixture model
query expansion
kernel function
kernel methods
smoothing methods
image features
training set
bayesian networks