Semantic and Relation Modulation for Audio-Visual Event Localization.
Hao Wang, Zheng-Jun Zha, Liang Li, Xuejin Chen, Jiebo Luo
Published in: IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Keyphrases
- audio visual
- multi modal
- audio visual content
- sports video
- visual information
- visual data
- multi stream
- person authentication
- semantic information
- video summarization
- temporal context
- multimedia
- natural language
- audio visual speech recognition
- event detection
- semantic search
- high level
- contextual information
- data management
- training set