Audio-visual Voice Conversion Using Deep Canonical Correlation Analysis for Deep Bottleneck Features.
Satoshi Tamura, Kento Horio, Hajime Endo, Satoru Hayamizu, Tomoki Toda
Published in: INTERSPEECH (2018)
Keyphrases
- canonical correlation analysis
- audio-visual
- multi-modal
- person authentication
- emotion recognition
- audio features
- least squares
- low level
- multi-stream
- feature vectors
- partial least squares
- feature set
- feature extraction
- visual information
- classification accuracy
- data sets
- co-occurrence
- image features
- dimensionality reduction methods
- feature space
- face recognition
- neural network