Multi-Granularity Aggregation Transformer for Joint Video-Audio-Text Representation Learning.

Mengge He, Wenjing Du, Zhiquan Wen, Qing Du, Yutong Xie, Qi Wu
Published in: IEEE Trans. Circuits Syst. Video Technol. (2023)
Keyphrases
  • multimedia
  • multi granularity
  • learning algorithm
  • high dimensional
  • supervised learning
  • digital libraries
  • databases
  • feature extraction
  • text representation