Accelerating Transformer Inference for Translation via Parallel Decoding.
Andrea Santilli, Silvio Severino, Emilian Postolache, Valentino Maiorca, Michele Mancusi, Riccardo Marin, Emanuele Rodolà
Published in: CoRR (2023)
Keyphrases
- bayesian networks
- fuzzy logic
- shared memory
- neural network
- decoding algorithm
- fault diagnosis
- machine translation
- inference process
- parallel implementation
- query translation
- massively parallel
- cross language information retrieval
- power transformers
- parallel programming
- bayesian model
- parallel computing
- bayesian inference
- probabilistic inference
- coding scheme
- image quality
- image segmentation