Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference.

Junyan Li, Li Lyna Zhang, Jiahang Xu, Yujing Wang, Shaoguang Yan, Yunqing Xia, Yuqing Yang, Ting Cao, Hao Sun, Weiwei Deng, Qi Zhang, Mao Yang
Published in: KDD (2023)
Keyphrases