Grounding Spoken Words in Unlabeled Video.
Angie W. Boggust, Kartik Audhkhasi, Dhiraj Joshi, David Harwath, Samuel Thomas, Rogério Schmidt Feris, Danny Gutfreund, Yang Zhang, Antonio Torralba, Michael Picheny, James R. Glass
Published in: CVPR Workshops (2019)
Keyphrases
- spoken words
- video data
- video sequences
- video content
- multimedia
- training data
- video streams
- active learning
- supervised learning
- video surveillance
- video frames
- video clips
- real time
- semi supervised learning
- labeled data
- video database
- space time
- multi modal
- hidden markov models
- video shots
- automatic speech recognition
- video search
- prior knowledge