Exploring Better Text Image Translation with Multimodal Codebook.
Zhibin Lan, Jiawei Yu, Xiang Li, Wen Zhang, Jian Luan, Bin Wang, Degen Huang, Jinsong Su
Published in: CoRR (2023)
Keyphrases
- image representation
- image features
- image analysis
- image data
- multiscale
- image classification
- image content
- input image
- single image
- high resolution
- text information
- multi modal
- bag of words
- segmentation method
- image retrieval
- low level
- web images
- image collections
- region of interest
- vector quantized
- text retrieval
- vector quantization
- image segmentation
- training set
- image restoration
- test images
- visual features
- image compression
- pixel values
- edge detection
- scanned documents
- natural language processing