WMT2016: A Hybrid Approach to Bilingual Document Alignment.
Sainik Kumar Mahata, Dipankar Das, Santanu Pal
Published in: WMT (2016)
Keyphrases
- word alignment
- document classification
- word level
- document images
- machine translation
- sentence pairs
- information retrieval systems
- document collections
- retrieval systems
- information retrieval
- text documents
- parallel texts
- source language
- image alignment
- vector space model
- cross lingual
- document clustering
- relevant documents
- cross language information retrieval
- parallel corpora
- structured documents
- document analysis
- cross language
- comparable corpora
- language independent
- tf idf
- dynamic time warping
- text classification
- co occurrence