Revisiting audio visual scene-aware dialog.
Aishan Liu, Huiyuan Xie, Xianglong Liu, Zixin Yin, Shunchang Liu
Published in: Neurocomputing (2022)
Keyphrases
- audio visual
- visual data
- video scene
- multi modal
- visual information
- multi stream
- person authentication
- video summarization
- image sequences
- video sequences
- temporal context
- multimedia
- audio visual speech recognition
- input image
- three dimensional
- emotion recognition
- high dimensional data
- visual features
- user interface
- image data
- natural language
- eye movements
- human computer interaction
- co occurrence
- low level
- high dimensional