Video-driven speaker-listener generation based on Transformer and neural renderer.
Daowu Yang, Qi Yang, Wen Jiang, Jifeng Chen, Zhengxi Shao, Qiong Liu
Published in: Multim. Tools Appl. (2024)
Keyphrases
- video data
- multimedia
- video streams
- neural network
- video content
- video sequences
- real time
- digital video
- video clips
- real time video
- generation process
- video database
- network architecture
- temporal information
- multimedia data
- speech recognition
- space time
- data driven
- neural model
- genetic algorithm
- spatial and temporal
- video segmentation
- video processing
- spatio temporal