Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech.
Mohamed Elmahdy, Mark Hasegawa-Johnson, Eiman Mustafawi
Published in: LREC (2014)
Keyphrases
- speech music discrimination
- broadcast news
- speech corpus
- speech transcripts
- audio visual
- audio stream
- conversational speech
- emotion recognition
- speaker identification
- audio signals
- text to speech
- multi modal
- spoken language
- gaussian mixture model
- speech synthesis
- speech recognition
- audio features
- automatic speech recognition
- speech processing
- audio recordings
- multimedia
- language identification
- linear predictive coding
- conversational agent
- human communication
- speaker verification
- speaker diarization
- audio video
- dynamic time warping
- prosodic features
- fully automatic
- confidence measure