CHAN: Cross-Modal Hybrid Attention Network for Temporal Language Grounding in Videos.

Wen Wang, Ling Zhong, Guang Gao, Minhong Wan, Jason Gu
Published in: ICME (2023)
Keyphrases
  • cross modal
  • multi modal
  • spatio temporal
  • multimedia retrieval
  • visual data
  • visual recognition
  • video data
  • perceptual information
  • similarity measure
  • video sequences
  • test collection
  • video frames