WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions.

Published in: CHI (2023)

Keyphrases

text to speech
real time
emotion recognition
speech synthesis
speech recognition
speech signal
speech recognition errors
speech quality
speech sounds
voice activity detection
automatic speech recognition
dialogue system
fundamental frequency
prosodic features
spoken dialogue systems
endpoint detection
synthesized speech
automatic speech recognition systems
spoken document retrieval
speaker recognition
graphics hardware
linear prediction
real time systems
multimodal
probabilistic model
natural language
pattern recognition
social networks