Publication: Fine-Grained Features Alignment and Fusion for Text-Video Cross-Modal Retrieval.