Efficient Video Transformers via Spatial-temporal Token Merging for Action Recognition.

Zhanzhou Feng, Jiaming Xu, Lei Ma, Shiliang Zhang
Published in: ACM Trans. Multim. Comput. Commun. Appl. (2024)