Audio Input Generates Continuous Frames to Synthesize Facial Video Using Generative Adversarial Networks.
Hanhaodi Zhang
Published in: CoRR (2022)
Keyphrases
- video frames
- multimedia
- audio video
- key frames
- video signals
- video content analysis
- video data
- digital video
- multimedia processing
- shot boundary detection
- video sequences
- video content
- emotion recognition
- scene change detection
- video scene
- single frame
- successive frames
- temporal coherence
- video analysis
- audio files
- visual data
- temporal filtering
- video clips
- generative model
- video images
- multimedia information
- motion features
- video files
- long video
- image frames
- video streams
- video segments
- network structure
- lecture videos
- video material
- soccer video
- audio visual
- audio stream
- social networks
- video database
- audio features
- visual information
- human faces
- story segmentation
- digital audio
- facial expressions
- audio signals
- input video
- facial images
- video objects
- video retrieval
- frame rate
- multi frame
- moving objects
- audio visual content
- multimedia data
- video recordings
- broadcast news
- facial expression recognition
- video shots
- audio content
- video summarization
- image sequences
- objects in video sequences