Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference.
Junyan Li
Li Lyna Zhang
Jiahang Xu
Yujing Wang
Shaoguang Yan
Yunqing Xia
Yuqing Yang
Ting Cao
Hao Sun
Weiwei Deng
Qi Zhang
Mao Yang
Published in:
CoRR (2023)
Keyphrases
search space
tree construction
data sets
cost effective
neural network
image sequences
pairwise
computationally efficient
power system
learning to rank
pruning strategy