Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining.
Giacomo Nebbia, Adriana Kovashka
Published in: ICMR (2023)
Keyphrases
- multi-modal
- named entities
- named entity recognition
- information extraction
- named entity extraction
- co-occurrence
- text mining
- question answering
- relation extraction
- natural language processing
- unsupervised learning
- annotated corpus
- multi-modality
- visual features
- cross-modal
- image annotation
- video content
- natural language
- supervised learning
- high-dimensional
- data mining
- image search
- semantic concepts
- high-level
- knowledge base
- computer vision