GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer.

Ding Jia, Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Chang Xu, Xinghao Chen
Published in: CoRR (2024)
Keyphrases
  • pixel-wise
  • multimodal fusion
  • computer vision
  • vision system
  • human computer interaction
  • image processing
  • multimodal interfaces
  • multiscale