Improving Self-Supervised Speech Representations by Disentangling Speakers.
Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David D. Cox, Mark Hasegawa-Johnson, Shiyu Chang
Published in: CoRR (2022)
Keyphrases
- speech recognition
- speaker dependent
- speech signal
- automatic speech recognition
- speech synthesis
- higher level
- speech processing
- information systems
- recognition engine
- spoken dialogue systems
- speaker adaptation
- speaker recognition
- speaker identification
- multiple representations
- neural network
- text to speech
- language acquisition
- multi modal
- probabilistic model
- website
- endpoint detection
- vowel phonemes