Lumos: Empowering Multimodal LLMs with Scene Text Recognition.
Ashish Shenoy, Yichao Lu, Srihari Jayakumar, Debojeet Chatterjee, Mohsen Moslehpour, Pierce Chuang, Abhay Harpale, Vikas Bhardwaj, Di Xu, Shicong Zhao, Longfang Zhao, Ankit Ramchandani, Xin Luna Dong, Anuj Kumar
Published in: KDD (2024)