GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer.
Ding Jia
Jianyuan Guo
Kai Han
Han Wu
Chao Zhang
Chang Xu
Xinghao Chen
Published in:
CoRR (2024)
Keyphrases
pixel-wise
multimodal fusion
computer vision
vision system
human computer interaction
image processing
multimodal interfaces
multiscale