A High-Quality Speech and Audio Codec With Less Than 10-ms Delay.
Jean-Marc Valin, Timothy B. Terriberry, Christopher Montgomery, Gregory Maxwell
Published in: IEEE Trans. Speech Audio Process. (2010)
Keyphrases
- high quality
- audio stream
- audio visual
- broadcast news
- audio signals
- speaker identification
- cepstral features
- text to speech
- audio features
- emotion recognition
- speech segments
- audio recordings
- digital audio
- speech music discrimination
- prosodic features
- automatic transcription
- speech signal
- multimedia
- acoustic signals
- video coding
- speech recognition
- speech processing
- audio video
- linear predictive coding
- multi modal
- spoken documents
- video codec
- visual information
- visual speech
- speech synthesis
- signal processing
- video streams
- low quality
- language acquisition
- automatic speech recognition
- image quality
- high resolution
- motion estimation
- pac man
- neural network
- human language
- multi stream
- video data
- spoken language
- noisy environments
- spontaneous speech
- speaker diarization
- facial expressions
- multimodal interfaces
- acoustic features
- global exponential stability
- visual features
- digital video
- bitstream