VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer.
Juan F. Montesinos
Venkatesh S. Kadandale
Gloria Haro
Published in:
CoRR (2022)
Keyphrases
audio visual
low latency
emotion recognition
multi modal
sound source
high speed
real time
high throughput
visual information
virtual machine
highly efficient
visual data
multimedia
multi stream
stream processing
databases
data acquisition
database
text classification
low cost
audio visual speech recognition