CapFormer: A Space-Time Video Description Model using Joint-Attention Transformer.

Mahamat Moussa, Chern Hong Lim, KokSheik Wong
Published in: APSIPA ASC (2023)
Keyphrases
  • space time
  • video sequences
  • spatio temporal
  • spatial and temporal
  • video analysis
  • temporal domain
  • visual features
  • motion model
  • dynamic scenes
  • video annotation
  • video representation
  • input video