Joint Autoregressive Modeling of End-to-End Multi-Talker Overlapped Speech Recognition and Utterance-level Timestamp Prediction.
Naoki Makishima, Keita Suzuki, Satoshi Suzuki, Atsushi Ando, Ryo Masumura
Published in: INTERSPEECH (2023)
Keyphrases
- speech recognition
- end to end
- autoregressive
- moving average
- linear prediction
- hidden markov models
- speech signal
- language model
- non stationary
- speech synthesis
- automatic speech recognition
- speech recognizer
- random fields
- pattern recognition
- speaker independent
- spectral analysis
- speaker identification
- speech recognition systems
- image compression
- edge detection
- higher order
- wavelet transform
- multiscale
- image processing