A factor automaton approach for the forced alignment of long speech recordings.
Pedro J. Moreno, Christopher Alberti
Published in: ICASSP (2009)
Keyphrases
- audio visual
- spontaneous speech
- speech recognition
- audio recordings
- image alignment
- multi modal
- acoustic features
- speech signal
- text to speech
- finite state machines
- finite state automata
- dialogue system
- emotion recognition
- speaker recognition
- factor analysis
- speech synthesis
- probabilistic model
- non stationary
- vocal tract
- multimodal interfaces
- speaker verification
- human machine interaction
- spoken language
- automatic speech recognition