Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition.

Published in: CoRR (2024)

Keyphrases