MAESTRO: Matched Speech Text Representations through Modality Matching.
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen
Published in: CoRR (2022)
Keyphrases
- text to speech
- text to speech synthesis
- string matching
- text recognition
- text input
- english text
- multi lingual
- speech recognition
- lexical features
- semantic representations
- approximate pattern matching
- text retrieval
- free text
- medical images
- multi modal
- image matching
- spoken documents
- text documents
- matching algorithm
- pattern matching
- information retrieval
- document analysis
- automatically discovering
- multiple modalities
- text mining
- speech signal
- higher level
- feature points
- spontaneous speech
- broadcast news
- database
- automatic speech recognition