MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models.
Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, Vishal M. Patel
Published in: CoRR (2024)
Keyphrases
- multi modal
- multiple modalities
- auto annotation
- image data
- input image
- uni modal
- edge detection
- image content
- multi modality
- video search
- image classification
- high resolution
- image analysis
- image retrieval
- image annotation
- web images
- multiscale
- image segmentation
- diffusion models
- single modality
- image representation
- vector field
- semantic concepts
- cross modal
- objective function
- keywords