Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset.
Diego de Benito-Gorrón, Alicia Lozano-Diez, Doroteo T. Toledano, Joaquin Gonzalez-Rodriguez
Published in: EURASIP J. Audio Speech Music. Process. (2019)
Keyphrases
- audio signals
- neural network
- audio features
- speech music discrimination
- audio recordings
- digital audio
- audio visual
- recurrent neural networks
- music information retrieval
- voice activity detection
- music genre classification
- audio stream
- speaker identification
- broadcast news
- feature set
- acoustic features
- pattern recognition
- music scores
- audio signal
- feed forward
- speech recognition
- detection algorithm
- noisy environments
- deep learning
- artificial neural networks
- gaussian mixture model
- text to speech
- object detection
- music score
- network architecture
- speech synthesis
- genre classification
- genetic algorithm
- soccer video
- signal processing
- fuzzy logic
- hidden markov models
- music retrieval
- multi stream
- restricted boltzmann machine
- speech signal
- acoustic signals
- low level
- unsupervised feature learning
- multimedia
- learning algorithm
- automatic music genre classification