Effect of Attention and Self-Supervised Speech Embeddings on Non-Semantic Speech Tasks.
Payal Mohapatra, Akash Pandey, Yueyuan Sui, Qi Zhu
Published in: ACM Multimedia (2023)
Keyphrases
- speech recognition
- speech signal
- speech synthesis
- audio visual
- broadcast news
- high level
- endpoint detection
- autistic children
- text to speech
- natural language
- hearing impaired
- recognition engine
- speaker recognition
- emotion recognition
- automatic speech recognition
- spoken language
- natural language understanding
- speaker identification
- noisy environments
- focus of attention
- dialogue system
- audio stream
- semantic analysis
- distance measure