VCSE: Time-Domain Visual-Contextual Speaker Extraction Network.

Junjie Li, Meng Ge, Zexu Pan, Longbiao Wang, Jianwu Dang

Published in: CoRR (2022)

Keyphrases

network structure
network model
visual information
information extraction
network traffic
speech recognition
visual perception
communication networks
contextual information
complex networks
network resources
computer networks
low level
frequency domain
multi modal
wireless sensor networks
context sensitive
link prediction
image retrieval
visual data
automatically extracted
network management
neural network
network design
real time