CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers.
Longwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang, Yangdong Deng
Published in: CoRR (2024)
Keyphrases
- efficient computation
- optimization algorithm
- Bayesian networks
- inference mechanism
- dual decomposition
- case study
- evolutionary algorithm
- optimization method
- global optimization
- database
- optimization process
- probabilistic inference
- discrete optimization
- parallel computation
- constrained optimization
- optimization methods
- data sets
- multi-layer
- Bayesian inference
- low latency
- inference process
- combinatorial optimization
- particle swarm optimization