Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition.
Chan-Jan Hsu
Yi-Chang Chen
Feng-Ting Liao
Pei-Chen Ho
Yu-Hsiang Wang
Po-Chun Hsu
Da-shan Shiu
Published in:
CoRR (2024)
Keyphrases
multi modal
text recognition
decoding algorithm
multi modality
viterbi algorithm
optical character recognition
single modality
generative model
hidden markov models
noise model
belief propagation
non binary
character recognition
computer vision
image annotation
text lines