Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis.
Chaoyou Fu, Yuhan Dai, Yondong Luo, Lei Li, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun
Published in: CoRR (2024)
Keyphrases
- video analysis
- multi modal
- comprehensive evaluation
- video data
- video processing
- video content analysis
- event detection
- event recognition
- semantic concepts
- video analytics
- video segmentation
- video annotation
- video indexing
- shot boundary detection
- video database
- object detection and tracking
- audio visual
- video streams
- multi modality
- sports video
- cross modal
- surveillance videos
- video search
- soccer video
- video shots
- video content
- video scene
- high dimensional
- video indexing and retrieval
- uni modal
- video sequences
- multiple modalities
- image annotation
- semantic video
- image classification
- multimedia