Narrowband perceptual audio coding: enhancements for speech.
Hossein Najaf-Zadeh, Peter Kabal
Published in: INTERSPEECH (2001)
Keyphrases
- speech signal
- audio stream
- speaker identification
- linear prediction
- audio visual
- linear predictive coding
- speech recognition
- broadcast news
- audio signals
- linear predictive
- cepstral features
- audio features
- digital audio
- acoustic features
- emotion recognition
- mel frequency cepstral coefficients
- coding scheme
- speech processing
- text to speech
- automatic speech recognition
- cross modal
- low level
- human visual system
- prosodic features
- non stationary
- multimedia
- speech music discrimination
- speech synthesis
- automatic transcription
- audio recordings
- acoustic signals
- multi stream
- multi modal
- signal processing
- noisy environments
- audio video
- visual information
- visual data
- cepstral coefficients
- human perception
- spoken documents
- speaker recognition
- video signals
- video streams
- content based video retrieval
- human language
- speaker diarization
- image coding
- gaussian mixture model
- audio signal