VoViT: Low Latency Graph-Based Audio-Visual Voice Separation Transformer.
Juan F. Montesinos
Venkatesh S. Kadandale
Gloria Haro
Published in:
ECCV (37) (2022)
Keyphrases
audio visual
low latency
emotion recognition
multi modal
sound source
high speed
high throughput
real time
highly efficient
visual information
virtual machine
multimedia
multi stream
visual data
audio visual speech recognition
hidden markov models
stream processing
computer vision
structured data