LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding.
Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu
Published in: ACL (1) (2024)