3M-TRANSFORMER: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking Prediction.
Mehdi Fatan
Emanuele Mincato
Dimitra Pintzou
Mariella Dimiccoli
Published in: CoRR (2023)
Keyphrases
multistage
multi stream
turn taking
audio visual
audio visual speech recognition
hidden markov models
lot sizing
single stage
dynamic programming
multi modal
multi party
reinforcement learning
social networks
computer vision
optimal policy
visual information