Pandora: Towards General World Model with Natural Language Actions and Video States.
Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
Published in: CoRR (2024)
Keyphrases
- world model
- semantic interpretation
- natural language
- special case
- multimedia
- video data
- vision system
- real time
- video content
- video streams
- human actions
- video sequences
- conceptual representation
- perceptual aliasing
- state transitions
- initial state
- decision theoretic
- video analysis
- human activities
- general purpose
- knowledge representation
- data analysis
- image processing