Speech-to-Lip Movement Synthesis by Maximizing Audio-Visual Joint Probability Based on the EM Algorithm.
Satoshi Nakamura, Eli Yamamoto
Published in: J. VLSI Signal Process. (2001)
Keyphrases
- audio visual
- EM algorithm
- joint probability
- audio visual speech recognition
- expectation maximization
- multi modal
- multi stream
- mixture model
- maximum likelihood
- generative model
- visual information
- multimedia
- Gaussian mixture model
- visual data
- hyperparameters
- parameter estimation
- maximum a posteriori
- audio features
- data sets
- machine learning
- conditional probabilities
- pairwise
- training data
- unsupervised learning
- low level