Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding.
Seongjun Yang
Gibbeum Lee
Jaewoong Cho
Dimitris Papailiopoulos
Kangwook Lee
Published in:
Trans. Mach. Learn. Res. (2024)
Keyphrases
trade off
decoding algorithm
decoding process
neural network
real time
information systems
lower bound
LDPC codes