UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation.

Haiyang Wang, Hao Tang, Shaoshuai Shi, Aoxue Li, Zhenguo Li, Bernt Schiele, Liwei Wang
Published in: ICCV (2023)
Keyphrases
  • multi modal
  • multi modality
  • audio visual
  • semantic concepts
  • video search
  • cross modal
  • high dimensional
  • eye movements
  • fusing multiple
  • feature selection
  • mutual information
  • image classification
  • visual recognition