Graph-Based Video-Language Learning with Multi-Grained Audio-Visual Alignment.
Chenyang Lyu, Wenxi Li, Tianbo Ji, Longyue Wang, Liting Zhou, Cathal Gurrin, Linyi Yang, Yi Yu, Yvette Graham, Jennifer Foster
Published in: ACM Multimedia (2023)
Keyphrases
- language learning
- audio visual
- video summarization
- visual data
- multimedia
- multi modal
- audio features
- visual information
- audio visual content
- video data
- video sequences
- computer assisted language learning
- language acquisition
- multi stream
- foreign language
- english language
- mobile learning
- video content
- audio visual speech recognition
- mobile language learning
- multimedia data
- image sequences
- video frames
- language learners
- low level