STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation.
Qingkai Fang, Rong Ye, Lei Li, Yang Feng, Mingxuan Wang
Published in: CoRR (2022)
Keyphrases
- text to speech
- text to speech synthesis
- speech recognition
- automatic speech recognition
- speech signal
- dialogue system
- spontaneous speech
- speech synthesis
- English text
- text recognition
- language generation
- audio visual
- lexical features
- information retrieval
- text input
- multi lingual
- spoken language
- noisy environments
- text mining
- speaker identification
- natural language generation
- geometric structure
- synthesized speech
- database