Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization.
Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
Published in: CoRR (2021)
Keyphrases
- language model
- vision guided
- pre-trained
- mobile robot navigation
- language modeling
- training data
- probabilistic model
- natural scenes
- generative model
- n-gram
- information retrieval
- training examples
- query expansion
- speech recognition
- mobile robot
- language modeling framework
- multimodal
- control signals
- smoothing methods
- multimedia
- human body
- natural images