Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding.
Seungwoo Choi, Seungju Han, Dongyoung Kim, Sungjoo Ha
Published in: CoRR (2020)
Keyphrases
- variable length
- text-to-speech
- fixed length
- speech synthesis
- n-gram
- bitstream
- prosodic features
- word processing
- video sequences
- text compression
- run length encoding
- programming tool
- statistical dependencies
- text-to-speech synthesis
- human motion
- writing skills
- video content
- key frames
- computer vision
- visual attention
- video data
- visual features
- knn
- feature vectors
- high quality