World Model on Million-Length Video And Language With RingAttention.
Hao Liu, Wilson Yan, Matei Zaharia, Pieter Abbeel
Published in: CoRR (2024)
Keyphrases
- world model
- semantic interpretation
- natural language
- video frames
- video sequences
- vision system
- real time
- space time
- language learning
- video data
- programming language
- multimedia
- video streams
- video database
- video clips
- video content
- semantic constraints
- spatial and temporal
- high quality
- real world
- human activities
- video surveillance
- domain knowledge
- image processing