SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment.
Ziping Ma
Furong Xu
Jian Liu
Ming Yang
Qingpei Guo
Published in: CoRR (2024)
Keyphrases
multi modal
human visual system
multimodal interaction
multimodal data
word alignment
neural network
empirically derived
multimodal information
image alignment
dynamic time warping
image quality
mutual information
multimodal interfaces
multiscale
global alignment
image sequences
data sets
procrustes analysis