Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding.
Kenton Lee
Mandar Joshi
Iulia Turc
Hexiang Hu
Fangyu Liu
Julian Eisenschlos
Urvashi Khandelwal
Peter Shaw
Ming-Wei Chang
Kristina Toutanova
Published in:
CoRR (2022)
Keyphrases
language understanding
natural language understanding
language processing
contextual constraints
semantic interpretation
dialogue system
spoken dialogue systems
natural language
general knowledge
cognitive psychology
low level
knowledge base
domain knowledge
visual processing