DINO-VITS: Data-Efficient Noise-Robust Zero-Shot Voice Cloning via Multi-Tasking with Self-Supervised Speaker Verification Loss.
Vikentii Pankov, Valeria Pronina, Alexander Kuzmin, Maksim Borisov, Nikita Usoltsev, Xingshan Zeng, Alexander Golubkov, Nikolai Ermolenko, Aleksandra Shirshova, Yulia Matveeva
Published in: CoRR (2023)