M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.
Zhe Chen
Heyang Liu
Wenyi Yu
Guangzhi Sun
Hongcheng Liu
Ji Wu
Chao Zhang
Yu Wang
Yanfeng Wang
Published in:
ACL (1) (2024)
Keyphrases
audio visual
multi modal
multimedia
visual information
video summarization
temporal context
multi stream
multimodal fusion
visual data
emotion recognition
data sets
audio features
multimedia data