Word2Pix: Word to Pixel Cross Attention Transformer in Visual Grounding.
Heng Zhao
Joey Tianyi Zhou
Yew-Soon Ong
Published in:
CoRR (2021)
Keyphrases
co-occurrence
n-gram
word recognition
word sense disambiguation
neural network
visual features
visual information
selective attention
related words
fuzzy logic
visual attention
image pixels
sentence level