OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer.
Lu Zhang
Tiancheng Zhao
Heting Ying
Yibo Ma
Kyusong Lee
Published in:
CoRR (2024)
Keyphrases
multi modal
multiple modalities
semantic concepts
video search
multi modality
multi agent
cross modal
multimedia
multi agent systems
image annotation
audio visual
high dimensional
higher level
video data
spatial and temporal
video database