Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model.

Haogeng Liu, Quanzeng You, Xiaotian Han, Yongfei Liu, Huaibo Huang, Ran He, Hongxia Yang
Published in: CoRR (2024)
Keyphrases
  • language model
  • context sensitive
  • language modeling
  • information retrieval
  • probabilistic model
  • visual features
  • mixture model
  • web documents
  • speech recognition
  • test collection