Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference.

Junyan Li, Li Lyna Zhang, Jiahang Xu, Yujing Wang, Shaoguang Yan, Yunqing Xia, Yuqing Yang, Ting Cao, Hao Sun, Weiwei Deng, Qi Zhang, Mao Yang
Published in: KDD (2023)