Multi-timescale Feature-extraction Architecture of Deep Neural Networks for Acoustic Model Training from Raw Speech Signal.
Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani
Published in: IROS (2018)
Keyphrases
- speech signal
- neural network
- feature extraction
- speaker identification
- speech recognition
- training process
- pattern recognition
- cepstral coefficients
- automatic speech recognition
- spectral analysis
- mel frequency cepstral coefficients
- background noise
- linear prediction
- speech quality
- vocal tract
- automatic speech recognition systems
- non-stationary
- artificial neural networks
- back propagation
- hearing aids
- feature selection
- image processing
- pattern classification
- feature set
- noisy environments
- training set
- hidden Markov models
- face recognition
- training data
- speaker recognition
- neural network model
- feature vectors
- natural images
- wavelet transform
- spectral features
- principal component analysis
- extracted features
- image classification
- frequency domain
- information retrieval