DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation.

Yongxin Zhu, Zhujin Gao, Xinyuan Zhou, Zhongyi Ye, Linli Xu
Published in: CoRR (2023)
Keyphrases
  • diffusion model
  • speech recognition
  • multiscale
  • speech signal
  • information systems
  • moving objects
  • social media
  • image data
  • diffusion process