Mining Parallel Texts from Mixed-Language Web Pages.
Masao Utiyama, Daisuke Kawahara, Keiji Yasuda, Eiichiro Sumita
Published in: MTSummit (2009)
Keyphrases
- web pages
- parallel texts
- machine translation system
- web logs
- web documents
- parallel corpus
- data mining
- search engine
- knowledge discovery
- bilingual dictionaries
- keywords
- web search
- text mining
- cross language information retrieval
- natural language
- web search engines
- web mining
- web data
- ground truth
- target language
- computational linguistics
- statistical machine translation
- lexico syntactic