Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding.
Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen
Published in: CoRR (2024)
Keyphrases
- real time
- parameter free
- hardware and software
- computationally efficient
- low latency
- parameter tuning
- low cost
- highly efficient
- image processing
- robust estimation
- hidden Markov models
- general purpose
- image quality
- database
- partial occlusion
- video sequences
- information retrieval
- memory efficient
- data sets
- VLSI implementation