SlideSpeech: A Large Scale Slide-Enriched Audio-Visual Corpus.
Haoxu Wang
Fan Yu
Xian Shi
Yuezhang Wang
Shiliang Zhang
Ming Li
Published in:
ICASSP (2024)
Keyphrases
audio visual
multi modal
visual information
visual data
multi stream
video summarization
multimedia
person authentication
emotion recognition
temporal context
multimodal fusion
audio visual speech recognition
text data