MSDWild: Multi-modal Speaker Diarization Dataset in the Wild.

Tao Liu, Shuai Fan, Xu Xiang, Hongbo Song, Shaoxiong Lin, Jiaqi Sun, Tianyuan Han, Siyuan Chen, Binwei Yao, Sen Liu, Yifei Wu, Yanmin Qian, Kai Yu
Published in: INTERSPEECH (2022)
Keyphrases
  • multi modal
  • speaker diarization
  • audio visual
  • high dimensional
  • multi modality
  • machine learning
  • speech recognition
  • image annotation
  • feature extraction
  • semantic concepts
  • cross modal
  • pattern recognition
  • uni modal