Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization.

Published in: CoRR (2023)
