PronScribe: Highly Accurate Multimodal Phonemic Transcription From Speech and Text.
Yang Yu, Matthew Perez, Ankur Bapna, Fadi Haik, Siamak Tazari, Yu Zhang
Published in: INTERSPEECH (2023)
Keyphrases
- highly accurate
- text to speech
- text to speech synthesis
- audio visual
- automatic transcription
- capable of producing
- english text
- multi lingual
- lexical features
- text recognition
- multimodal interfaces
- multimodal interaction
- high quality
- spontaneous speech
- speech recognition technology
- text input
- speech recognition systems
- accurate models
- information retrieval
- free text
- multi modal
- high accuracy
- text retrieval
- text mining
- search engine
- multimedia
- speech sounds
- keywords
- spoken documents
- human computer interaction
- speech synthesis
- speech recognition
- handwriting recognition
- natural language
- conversational speech
- text documents
- speech signal