Distributed Inference Performance Optimization for LLMs on CPUs.
Pujiang He, Shan Zhou, Changqing Li, Wenhuan Huang, Weifei Yu, Duyi Wang, Chen Meng, Sheng Gui
Published in: CoRR (2024)
Keyphrases
- distributed systems
- Bayesian networks
- optimization algorithm
- multi agent
- optimization process
- probabilistic inference
- global optimization
- autonomy oriented computing
- dual decomposition
- distributed network
- agent technology
- fault tolerant
- computer networks
- database
- distributed environment
- optimization method
- mobile agents
- optimization problems
- cooperative
- load balancing
- parallel processing
- convex optimization
- optimization methods
- constrained optimization
- hidden Markov models
- inference process
- discrete optimization
- multi objective
- databases