Multimodal grid features and cell pointers for Scene Text Visual Question Answering.
Lluís Gómez
Ali Furkan Biten
Rubèn Tito
Andrés Mafla
Marçal Rusiñol
Ernest Valveny
Dimosthenis Karatzas
Published in:
CoRR (2020)
Keyphrases
question answering
natural language processing
low-level
information extraction
natural language
semantic roles
named entities
information retrieval
feature extraction
visual information
feature set
feature vectors
question answering systems
text categorization
semantic information
co-occurrence
audio-visual