ProVLA: Compositional Image Search with Progressive Vision-Language Alignment and Multimodal Fusion.
Zhizhang Hu, Xinliang Zhu, Son Tran, René Vidal, Arnab Dhua
Published in: ICCV (Workshops) (2023)
Keyphrases
- image search
- multimodal fusion
- relevance feedback
- image classification
- image retrieval
- image annotation
- multimodal interfaces
- image collections
- visual features
- computer vision
- web images
- web image search
- high robustness
- human computer interaction
- vision system
- image processing
- human centered
- active learning
- user interface