FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning.
Tri Dao
Published in: ICLR (2024)
Keyphrases
visual attention
shared memory
parallel processing
information retrieval
focus of attention
evolutionary algorithm
case study
databases
multi agent
data structure
pairwise
website
query processing
mobile robot
information systems
learning algorithm
genetic algorithm
massively parallel