End-to-end audiovisual speech activity detection with bimodal recurrent neural models.
Fei Tao, Carlos Busso
Published in: Speech Commun. (2019)
Keyphrases
- end to end
- neural models
- spiking neural networks
- smart room
- recurrent neural networks
- speaker diarization
- biologically inspired
- neural model
- feed forward
- neural network model
- neural network
- artificial neural networks
- bio inspired
- learning rules
- biologically plausible
- video retrieval
- visual information
- multimedia content
- multi modal
- congestion control
- speech recognition
- audio visual
- training algorithm
- word error rate
- radial basis function