GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints.
Joshua Ainslie
James Lee-Thorp
Michiel de Jong
Yury Zemlyanskiy
Federico Lebrón
Sumit Sanghai
Published in:
CoRR (2023)
Keyphrases
real time
statistical models
information retrieval
decision trees
probabilistic model
structured prediction
complex systems
database
video sequences
text classification
query expansion
parameter estimation
query evaluation
search queries
database queries