VCSE: Time-Domain Visual-Contextual Speaker Extraction Network.
Junjie Li, Meng Ge, Zexu Pan, Longbiao Wang, Jianwu Dang
Published in: CoRR (2022)
Keyphrases
- network structure
- network model
- visual information
- information extraction
- network traffic
- speech recognition
- visual perception
- communication networks
- contextual information
- complex networks
- network resources
- computer networks
- low level
- frequency domain
- multi modal
- wireless sensor networks
- context sensitive
- link prediction
- image retrieval
- visual data
- automatically extracted
- network management
- neural network
- network design
- real time