Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining.
Giacomo Nebbia, Adriana Kovashka
Published in: CoRR (2023)
Keyphrases
- multi-modal
- named entities
- named entity recognition
- information extraction
- co-occurrence
- named entity extraction
- natural language processing
- relation extraction
- text mining
- question answering
- annotated corpus
- multi-modality
- image search
- visual features
- image annotation
- cross-modal
- video search
- semantic concepts
- high level
- uni-modal
- image representation
- unsupervised learning
- probabilistic model
- active learning
- high dimensional
- pairwise