Quantifying the amount of visual information used by neural caption generators.
Marc Tanti, Albert Gatt, Kenneth P. Camilleri
Published in: CoRR (2018)
Keyphrases
- visual information
- visual features
- visual processing
- visual data
- low level
- visual content
- image classification
- semantic information
- human visual system
- visual cues
- image retrieval
- image collections
- textual information
- image annotation
- visual information retrieval
- content based image retrieval systems
- low level features
- visual similarity
- video database
- content based image
- database
- multi modal
- prior knowledge
- similarity measure
- visual descriptors
- visual and textual information
- receptive fields
- audio visual
- key frames
- eye movements
- image database
- high level
- multimedia
- artificial intelligence