COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning.
Simon Ging
Mohammadreza Zolfaghari
Hamed Pirsiavash
Thomas Brox
Published in:
NeurIPS (2020)
Keyphrases
learning algorithm
prior knowledge
text representation
information retrieval
multimedia
concept learning
object recognition
digital libraries
image classification
text classification
data fusion