ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models.

Chunjiang Ge, Sijie Cheng, Ziming Wang, Jiale Yuan, Yuan Gao, Jun Song, Shiji Song, Gao Huang, Bo Zheng
Published in: CoRR (2024)
Keyphrases
  • data sets
  • prior knowledge
  • visual features
  • classification models
  • machine learning
  • low level
  • multi modal
  • statistical model
  • statistical models
  • visual information
  • bayesian framework
  • structural model
  • visual tasks