Translatotron 2: High-quality direct speech-to-speech translation with voice preservation.
Ye Jia, Michelle Tadmor Ramanovich, Tal Remez, Roi Pomerantz
Published in: ICML (2022)
Keyphrases
- high quality
- text to speech
- speech recognition
- speech synthesis
- speech signal
- endpoint detection
- audio visual
- emotion recognition
- prosodic features
- recognition engine
- speech processing
- fundamental frequency
- speaker recognition
- synthesized speech
- spoken language
- speech quality
- speech recognition errors
- hands free
- automatic speech recognition systems
- spontaneous speech
- vocal tract
- english text
- spoken dialogue systems
- automatic speech recognition
- higher quality
- cross language
- multi modal
- natural language processing
- hidden markov models