End-to-end audiovisual speech activity detection with bimodal recurrent neural models.

Fei Tao, Carlos Busso

Published in: Speech Commun. (2019)

Keyphrases

end to end
neural models
spiking neural networks
smart room
recurrent neural networks
speaker diarization
biologically inspired
neural model
feed forward
neural network model
neural network
artificial neural networks
bio inspired
learning rules
biologically plausible
video retrieval
visual information
multimedia content
multi modal
congestion control
speech recognition
audio visual
training algorithm
word error rate
radial basis function