Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video.
Minsu Kim, Chae Won Kim, Yong Man Ro
Published in: CoRR (2023)
Keyphrases
- learning process
- prior knowledge
- real time
- learning systems
- visual information
- unsupervised learning
- video frames
- online learning
- video streams
- learning tasks
- low level
- video sequences
- visual data
- video search
- deep learning
- visual learning
- human faces
- space time
- video data
- knowledge acquisition
- face images
- supervised learning
- learning environment
- image sequences
- multimedia
- computer vision