Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline.
Zangwei Zheng, Xiaozhe Ren, Fuzhao Xue, Yang Luo, Xin Jiang, Yang You
Published in: CoRR (2023)
Keyphrases
- fixed length
- scheduling problem
- scheduling algorithm
- genetic algorithm
- bayesian networks
- resource constraints
- parallel machines
- resource allocation
- probabilistic inference
- real time
- inference engine
- bayesian inference
- processing pipeline
- total length
- flexible manufacturing systems
- inference process
- human perception
- random fields
- petri net
- neural network