SpeakingFaces: A Large-Scale Multimodal Dataset of Voice Commands with Visual and Thermal Video Streams.
Madina Abdrakhmanova, Askat Kuzdeuov, Sheikh Jarju, Yerbolat Khassanov, Michael Lewis, Huseyin Atakan Varol
Published in: CoRR (2020)
Keyphrases
- video streams
- video data
- thermal images
- multimodal information
- video clips
- video content
- video frames
- eye contact
- multi modal
- databases
- visual information
- visual features
- infrared
- video representation
- multimedia
- detecting moving objects
- low level
- video sequences
- visual data
- multimodal interaction
- compressed video
- three dimensional
- database
- infrared camera