Integration of speech recognition, text-to-speech synthesis, and talker verification into a hands-free audio/image teleconferencing system (HuMaNet).
David A. Berkley, James L. Flanagan
Published in: ICSLP (1990)
Keyphrases
- speech recognition
- speaker identification
- input image
- speech signal
- speech processing
- speech recognition technology
- speech synthesis
- image classification
- background noise
- text to speech
- pattern recognition
- language model
- automatic speech recognition
- hidden markov models
- image segmentation
- text to speech synthesis
- visual data
- image retrieval
- noisy environments
- signal processing
- speech recognizer
- audio visual speech recognition
- feature points
- low level
- high resolution
- cepstral coefficients
- machine learning
- human computer interaction
- hands free
- image processing