MelGAN-VC: Voice Conversion and Audio Style Transfer on arbitrarily long samples using Spectrograms.
Marco Pasini
Published in: CoRR (2019)
Keyphrases
- emotion recognition
- text to speech
- data sets
- multimedia
- training samples
- audio visual
- prosodic features
- transfer learning
- knowledge transfer
- audio video
- digital video
- voice activity detection
- sampling methods
- data samples
- signal processing
- neural network
- visual features
- training set
- audio signals
- reinforcement learning
- metadata
- voice and data services