Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering.
Alireza Salemi, Mahta Rafiee, Hamed Zamani
Published in: CoRR (2023)
Keyphrases
- multi modal
- question answering
- cross modal
- information retrieval
- question classification
- domain knowledge
- natural language processing
- syntactic information
- information extraction
- passage retrieval
- video search
- cross language
- natural language
- natural language questions
- knowledge base
- audio visual
- visual information
- multi modality
- single modality
- knowledge representation
- question answering systems
- machine translation
- visual features
- semantic roles
- high dimensional
- multiple modalities
- answering questions
- training set
- speech transcripts