PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation.
Branden Butler, Sixing Yu, Arya Mazaheri, Ali Jannesari
Published in: CoRR (2024)
Keyphrases
- Bayesian networks
- inference problems
- inference mechanism
- efficient learning
- probabilistic inference
- database
- neural network
- inference process
- inference engine
- random fields
- asynchronous circuits
- parallel architecture
- grammatical inference
- structured prediction
- Bayesian inference
- hidden Markov models
- special case
- relational databases
- expert systems
- multiscale
- decision trees
- website
- computer vision
- databases