ZeroVL: A Strong Baseline for Aligning Vision-Language Representations with Limited Resources.
Quan Cui, Boyan Zhou, Yu Guo, Weidong Yin, Hao Wu, Osamu Yoshie
Published in: CoRR (2021)
Keyphrases
- limited resources
- processing power
- computing resources
- programming language
- real time
- computer vision
- resource limitations
- image processing
- vision system
- natural language
- image registration
- language learning
- embedded systems
- relative improvement
- multiple representations
- mobile phone
- computational linguistics
- higher level
- software development
- relational databases
- high level
- data sets