WFST-based structural classification integrating DNN acoustic features and RNN language features for speech recognition.
Quoc Truong Do, Satoshi Nakamura, Marc Delcroix, Takaaki Hori
Published in: ICASSP (2015)
Keyphrases
- speech recognition
- Mel frequency cepstral coefficients
- acoustic features
- speech signal
- speaker identification
- automatic speech recognition
- speech recognition systems
- feature vectors
- pattern recognition
- feature set
- speaker recognition
- feature extraction
- classification accuracy
- cepstral coefficients
- feature space
- speaker diarization
- hidden Markov models
- language model
- extracting features
- Gaussian mixture model
- audio features
- speaker verification
- extracted features
- training process
- spectral features
- noisy environments
- visual features
- low level
- feature selection
- neural network
- machine learning
- principal component analysis
- natural language processing
- broadcast news
- structural features
- signal to noise ratio
- text classification
- support vector machine