Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling.

Hsin-Ying Lee, Hung-Ting Su, Bing-Chen Tsai, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu
Published in: CoRR (2022)
Keyphrases