Facetron: Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations.

Published in: CoRR (2021)

Keyphrases