The Right to Talk: An Audio-Visual Transformer Approach.
Thanh-Dat Truong
Chi Nhan Duong
The De Vu
Hoang Anh Pham
Bhiksha Raj
Ngan Le
Khoa Luu
Published in:
CoRR (2021)
Keyphrases
audio visual
multi modal
visual information
visual data
multi stream
emotion recognition
person authentication
multimedia
video summarization
audio visual speech recognition
temporal context
data sets
pattern recognition
three dimensional
image database
image features
domain knowledge
multiscale
high level
databases