SlideSpeech: A Large-Scale Slide-Enriched Audio-Visual Corpus.
Haoxu Wang
Fan Yu
Xian Shi
Yuezhang Wang
Shiliang Zhang
Ming Li
Published in: CoRR (2023)
Keyphrases
audio visual
multi modal
visual information
visual data
multi stream
person authentication
emotion recognition
multimedia
audio visual speech recognition
temporal context
multimodal fusion
video summarization
hidden markov models
spatio temporal
image sequences
data processing
low level
domain knowledge