Publication: UMTIT: Unifying Recognition, Translation, and Generation for Multimodal Text Image Translation.