M3ST: Mix at Three Levels for Speech Translation.

Xuxin Cheng, Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Yuexian Zou

Published in: ICASSP (2023)

Keyphrases

speech recognition
endpoint detection
speech signal
automatic speech recognition
recognition engine
text to speech
machine translation
query translation
levels of abstraction
lower levels
text to speech synthesis
real time
language resources
speech synthesis
broadcast news
spoken language
emotion recognition
cross language information retrieval
audio visual