DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale.
Reza Yazdani Aminabadi, Samyam Rajbhandari, Minjia Zhang, Ammar Ahmad Awan, Cheng Li, Du Li, Elton Zheng, Jeff Rasley, Shaden Smith, Olatunji Ruwase, Yuxiong He
Published in: CoRR (2022)
Keyphrases
- efficient inference
- probabilistic inference
- fully connected
- structured prediction
- human pose estimation
- factor graphs
- Markov random field
- exact inference
- hidden variables
- Bayesian networks
- conditional random fields
- approximate inference
- linear models
- maximum margin
- graph structure
- prior knowledge
- probabilistic model
- influence diagrams
- belief networks
- pose estimation