Publication: Pyramidal Cross-Modal Transformer with Sustained Visual Guidance for Multi-Label Image Classification.