Multimodal Turn-Taking Model Using Visual Cues for End-of-Utterance Prediction in Spoken Dialogue Systems.
Fuma Kurata
Mao Saeki
Shinya Fujie
Yoichi Matsuyama
Published in:
INTERSPEECH (2023)
Keyphrases
visual cues
multimedia
computer vision
visual features
visual information
dialogue system