Token-level Speaker Change Detection Using Speaker Difference and Speech Content via Continuous Integrate-and-fire.
Zhiyun Fan, Zhenlin Liang, Linhao Dong, Yi Liu, Shiyu Zhou, Meng Cai, Jun Zhang, Zejun Ma, Bo Xu
Published in: INTERSPEECH (2022)
Keyphrases
- change detection
- speech recognition
- speaker recognition
- speaker verification
- automatic speech recognition
- audio visual
- speaker identification
- remote sensing
- speaker diarization
- speaker dependent
- remote sensing images
- vocal tract
- prosodic features
- remotely sensed images
- remote sensing imagery
- image registration
- satellite imagery
- satellite images
- automatic speech recognition systems
- speech signal
- Gaussian mixture model
- land cover
- land cover change
- synthesized speech
- data streams
- remotely sensed
- speaker adaptation
- neural network
- multimedia
- speech synthesis
- hidden Markov models
- automatic transcription
- language model
- feature extraction
- acoustic features
- image processing
- gamma distributions
- information content
- noisy environments
- text to speech
- image classification
- remotely sensed data
- man made structures
- speech sounds