CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model.
Shuai Zhao, Xiaohan Wang, Linchao Zhu, Yi Yang
Published in: CoRR (2023)
Keyphrases
- language model
- language modeling
- pre-trained
- n-gram
- scene text recognition
- speech recognition
- document retrieval
- probabilistic model
- information retrieval
- word error rate
- retrieval model
- test collection
- query expansion
- context-sensitive
- ad hoc information retrieval
- query terms
- relevance model
- mixture model
- computer vision
- training data
- smoothing methods
- TREC collections
- semi-supervised
- learning algorithm