CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers.
Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
Published in: CoRR (2022)
Keyphrases
- video collections
- text generation
- natural language descriptions
- video retrieval
- multimedia
- video search
- text detection
- video content
- video segments
- video clips
- video data
- news video
- digital video
- video database
- video sequences
- real world
- real time
- text retrieval
- video frames
- text mining
- multimedia documents
- video analysis
- small scale
- database
- video streams
- free text
- spatial and temporal
- semantic labels
- text documents
- semantic concepts
- space time
- text data
- textual descriptions
- text information
- natural language
- audio content