Research on image caption generation method based on multi-modal pre-training model and text mixup optimization.
Jing-Tao Sun
Xuan Min
Published in:
Signal Image Video Process. (2024)
Keyphrases
multi modal
generation method
image retrieval
image classification
image content
image segmentation
input image
image analysis
multiple modalities
video search
uni modal
spatial context
cross modal
audio visual
image annotation
graph cuts
image features
feature extraction