VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining.

Junjie Ke, Keren Ye, Jiahui Yu, Yonghui Wu, Peyman Milanfar, Feng Yang
Published in: CVPR (2023)
Keyphrases
  • image data
  • input image
  • image retrieval
  • image features
  • multiscale
  • spatial information
  • low level
  • image classification
  • vision system
  • unsupervised learning
  • image representation
  • question answering