Understanding Shared Speech-Text Representations.

Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang

Published in: CoRR (2023)

Keyphrases

text to speech synthesis
text to speech
speech recognition
text recognition
external representations
text input
semantic representations
information retrieval
English text
text retrieval
multilingual
web documents
keywords
database
natural language generation
conversational speech
speech signal
language generation
text data
text documents
deeper understanding
automatic speech recognition
audio visual
automatically discovering
semantic information
multimodal
hidden Markov models
digital libraries
machine learning