Dual Modalities of Text: Visual and Textual Generative Pre-training.
Yekun Chai, Qingyi Liu, Jingwu Xiao, Shuohuan Wang, Yu Sun, Hua Wu
Published in: CoRR (2024)
Keyphrases
- textual data
- textual information
- visual information
- textual features
- free text
- visual representations
- text mining
- text analytics
- multiple modalities
- visual features
- plain text
- training set
- textual case based reasoning
- keywords
- cross modal
- multi modal
- text retrieval
- web images
- video search
- manually constructed
- low level
- semantic content
- training examples
- discriminative training
- natural language
- news video
- textual descriptions
- information retrieval
- visual data
- visual content
- generative model
- visual and textual information
- textual and visual information
- visual and textual features
- text content
- discriminative classifiers
- visual cues
- training process
- text documents
- structured data
- unsupervised learning
- natural language processing
- information extraction