VLCDoC: Vision-Language contrastive pre-training model for cross-modal document classification.
Souhail Bakkali
Zuheng Ming
Mickaël Coustaty
Marçal Rusiñol
Oriol Ramos Terrades
Published in:
Pattern Recognit. (2023)
Keyphrases
document classification
probabilistic model
computer vision
cross modal
neural network
high level
multi modal
databases
machine learning
feature extraction
training data
multiscale
text mining