Audio-driven Talking Face Video Generation with Natural Head Pose.
Ran Yi, Zipeng Ye, Juyong Zhang, Hujun Bao, Yong-Jin Liu
Published in: CoRR (2020)
Keyphrases
- face pose
- multimedia
- audio video
- real time
- head motion
- scene change detection
- human head
- head pose estimation
- upper body
- visual focus of attention
- head orientation
- video content analysis
- digital video
- visual data
- facial expressions
- multimedia processing
- illumination variations
- facial features
- video streams
- face model
- video files
- face images
- video sequences
- audio stream
- audio features
- pose estimation
- audio visual
- frontal face
- pose variations
- video data
- video analysis
- face detection and tracking
- multimedia information
- human faces
- audio files
- multimodal fusion
- mouth region
- head movements
- head tracking
- digital audio
- audio signals
- video retrieval
- facial images
- video frames
- visual information
- audio signal
- facial pose
- human body
- gaze direction
- media streams
- face detection
- video content
- facial actions
- broadcast news
- active appearance models
- focus of attention
- 3D objects
- audio content
- video clips
- soccer video
- face tracking
- eye gaze