Lumos: Empowering Multimodal LLMs with Scene Text Recognition.

Ashish Shenoy, Yichao Lu, Srihari Jayakumar, Debojeet Chatterjee, Mohsen Moslehpour, Pierce Chuang, Abhay Harpale, Vikas Bhardwaj, Di Xu, Shicong Zhao, Longfang Zhao, Ankit Ramchandani, Xin Luna Dong, Anuj Kumar
Published in: CoRR (2024)
Keyphrases
  • scene text recognition
  • object recognition
  • multi modal
  • multimodal interaction
  • audio visual
  • high dimensional
  • brain image analysis
  • database
  • information systems
  • decision making
  • digital libraries
  • preprocessing