HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips.
Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, Josef Sivic
Published in: CoRR (2019)
Keyphrases
- video clips
- m learning
- tv shows
- video segments
- video database
- mobile learning
- closed captions
- video collections
- video data
- video frames
- video content
- key frames
- mobile phone
- e learning
- multimedia documents
- mobile applications
- video streams
- video retrieval
- mobile devices
- context aware
- higher education
- learning experience
- video shots
- long video
- distance education
- video dataset
- topic segmentation
- keywords
- video material
- learning systems
- learning technologies
- video sequences
- news video
- information retrieval
- video classification
- tv programs
- user experience
- temporal information
- human interactions
- video scene
- database
- databases