Rule-Embedded Network for Audio-Visual Voice Activity Detection in Live Musical Video Streams.
Yuanbo Hou
Yi Deng
Bilei Zhu
Zejun Ma
Dick Botteldooren
Published in:
ICASSP (2021)
Keyphrases
video streams
audio visual
video data
multi modal
visual data
visual information
voice activity detection
video content
video summarization
multi stream
audio visual speech recognition
multimedia
video frames
sports video
feature space
multiscale
noisy environments
sound source