V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs.
Penghao Wu, Saining Xie
Published in: CoRR (2023)
Keyphrases
- visual search
- eye movements
- visual attention
- active vision
- human computer interactions
- image retrieval
- eye tracking
- visual environment
- human computer interaction
- visual search tasks
- target detection
- visual processing
- eye tracker
- video retrieval
- object recognition
- human vision
- learning mechanism
- artificial intelligence
- visual information
- pre-attentive