LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding.
Yanzhe Zhang, Ruiyi Zhang, Jiuxiang Gu, Yufan Zhou, Nedim Lipka, Diyi Yang, Tong Sun
Published in: CoRR (2023)
Keyphrases
- image understanding
- image annotation
- image interpretation
- object recognition
- computer vision
- visual information
- web images
- object detection
- image analysis
- high level
- control structure
- image segmentation
- text mining
- low level vision
- free text
- information retrieval
- image description
- semantic content
- visual perception
- image analysis and computer vision
- image processing
- text retrieval
- computer technology
- image database
- multimedia
- image restoration
- visual features
- low level
- pattern recognition
- keywords
- feature extraction
- computational vision
- database