Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling.

Hsin-Ying Lee, Hung-Ting Su, Bing-Chen Tsai, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu
Published in: CoRR (2022)
Keyphrases