Automatically Mining Parallel Corpora for Minority Languages from Web Pages.
Zede Zhu, Miao Li, Lei Chen, Weihui Zeng
Published in: IALP (2012)
Keyphrases
- parallel corpora
- web pages
- language independent
- comparable corpora
- cross lingual
- machine translation
- cross language information retrieval
- labor intensive
- statistical machine translation
- bilingual dictionaries
- query translation
- machine translation system
- sentence pairs
- web logs
- cross language
- word pairs
- data mining
- search engine
- text mining
- keywords
- text retrieval
- wikipedia articles
- sentence level
- web mining
- web documents
- web search
- knowledge discovery
- web search engines
- link analysis
- web data