Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline.
Zangwei Zheng, Xiaozhe Ren, Fuzhao Xue, Yang Luo, Xin Jiang, Yang You
Published in: CoRR (2023)
Keyphrases
- fixed length
- scheduling problem
- scheduling algorithm
- genetic algorithm
- bayesian networks
- resource constraints
- parallel machines
- resource allocation
- probabilistic inference
- real time
- inference engine
- bayesian inference
- processing pipeline
- total length
- flexible manufacturing systems
- inference process
- human perception
- random fields
- petri net
- neural network