Accelerating Transformer Inference for Translation via Parallel Decoding.
Andrea Santilli, Silvio Severino, Emilian Postolache, Valentino Maiorca, Michele Mancusi, Riccardo Marin, Emanuele Rodolà
Published in: ACL (1) (2023)
Keyphrases
- probabilistic inference
- machine translation
- parallel processing
- parallel computing
- fuzzy logic
- parallel computation
- shared memory
- finite state transducers
- distributed memory
- bayesian inference
- fault diagnosis
- probability distribution
- bayesian networks
- belief networks
- data sets
- inference engine
- query translation
- massively parallel
- dynamic bayesian networks
- power system
- inference process
- parallel execution