Using Term Position Similarity and Language Modeling for Bilingual Document Alignment.
Thanh Le, Hoa Trong Vu, Jonathan Oberländer, Ondrej Bojar
Published in: WMT (2016)
Keyphrases
- language modeling
- cross lingual
- language modeling framework
- document language models
- language model
- improvements in retrieval effectiveness
- word alignment
- relevance model
- document similarity
- term weighting schemes
- information retrieval
- term weighting
- retrieval model
- pseudo feedback
- document representation
- query terms
- query expansion
- comparable corpora
- document length
- term dependencies
- n gram
- probabilistic model
- translation model
- document retrieval
- language modeling approaches
- term weights
- language independent
- retrieval effectiveness
- term frequency
- cross language
- vector space model
- text classification
- information retrieval systems
- cross language information retrieval
- parallel corpus
- statistical machine translation
- word level
- tf idf
- distance measure
- multiword
- finite state transducers
- similarity measure
- web search
- ad hoc information retrieval
- machine translation
- sentence retrieval
- test collection
- semantic similarity
- pseudo relevance feedback
- retrieval systems
- relevant documents
- machine translation system
- word segmentation
- query translation
- similarity search
- smoothing methods
- document collections
- query specific
- document clustering
- text retrieval
- retrieved documents
- statistical translation models
- multimedia