STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation.

Qingkai Fang, Rong Ye, Lei Li, Yang Feng, Mingxuan Wang

Published in: CoRR (2022)

Keyphrases

text to speech
text to speech synthesis
speech recognition
automatic speech recognition
speech signal
dialogue system
spontaneous speech
speech synthesis
English text
text recognition
language generation
audio visual
lexical features
information retrieval
text input
multilingual
spoken language
noisy environments
text mining
speaker identification
natural language generation
geometric structure
synthesized speech
database