CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers.
Longwei Zou
Qingyang Wang
Han Zhao
Jiangang Kong
Yi Yang
Yangdong Deng
Published in:
ACL (1) (2024)
Keyphrases
global optimization
optimization process
optimization algorithm
response time
genetic algorithm
optimization method
neural network
belief networks
multi layer
efficient computation
low latency