Multimodal Turn-Taking Model Using Visual Cues for End-of-Utterance Prediction in Spoken Dialogue Systems.

Fuma Kurata, Mao Saeki, Shinya Fujie, Yoichi Matsuyama
Published in: INTERSPEECH (2023)
Keyphrases
  • visual cues
  • multimedia
  • computer vision
  • visual features
  • visual information
  • dialogue system