Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation.
Kartik Audhkhasi, Panayiotis G. Georgiou, Shrikanth S. Narayanan
Published in: ICASSP (2012)
Keyphrases
- broadcast news
- crowd sourced
- automatic transcription
- spoken documents
- prosodic features
- acoustic features
- audio visual
- speaker identification
- audio stream
- automatic speech recognition
- spontaneous speech
- noisy environments
- visual speech
- emotion recognition
- speech recognition
- text to speech
- mel frequency cepstral coefficients
- speech sounds
- speaker diarization
- speaker verification
- speech signal
- crowd sourcing
- spoken document retrieval
- multimedia
- speech synthesis
- audio signal
- object retrieval
- image data
- hidden markov models