Putting a Face to the Voice: Fusing Audio and Visual Signals Across a Video to Determine Speakers.
Ken Hoover, Sourish Chaudhuri, Caroline Pantofaru, Malcolm Slaney, Ian Sturdy. Published in: CoRR (2017)
Keyphrases
- visual data
- visual speech
- audio signals
- visual information
- emotion recognition
- video indexing and retrieval
- multimedia
- audio video
- video signals
- content based video retrieval
- signal processing
- video data
- noisy environments
- visual cues
- video sequences
- video content
- scene change detection
- facial expressions
- speech recognition
- audio visual
- acoustic signals
- digital video
- speaker identification
- video files
- low level
- audio signal
- video search
- multimedia data
- video streams
- video database
- video material
- multimodal fusion
- broadcast news
- visual features
- human faces
- fundamental frequency
- multimedia processing
- mouth region
- visual analysis
- audio files
- video annotation
- news video
- multimedia information
- audio stream
- hidden markov models
- lifelog
- facial images
- voice recognition
- cross modal
- real time video
- lecture videos
- audio features
- video indexing
- video analysis
- acoustic signal
- cepstral features
- face images
- face detection and tracking
- video retrieval
- visual content
- video recordings
- video content analysis
- key frames
- video shots
- music information retrieval
- feature vectors