Lumos : Empowering Multimodal LLMs with Scene Text Recognition.
Ashish Shenoy, Yichao Lu, Srihari Jayakumar, Debojeet Chatterjee, Mohsen Moslehpour, Pierce Chuang, Abhay Harpale, Vikas Bhardwaj, Di Xu, Shicong Zhao, Longfang Zhao, Ankit Ramchandani, Xin Luna Dong, Anuj Kumar
Published in: CoRR (2024)