Publication: Is a Video worth n×n Images? A Highly Efficient Approach to Transformer-based Video Question Answering.