Generating Monolingual Dataset for Low Resource Language Bodo from old books using Google Keep.
Sanjib Narzary, Maharaj Brahma, Mwnthai Narzary, Gwmsrang Muchahary, Pranav Kumar Singh, Apurbalal Senapati, Sukumar Nandi, Bidisha Som
Published in: LREC (2022)
Keyphrases
- target language
- european languages
- parallel corpus
- machine translation system
- source language
- machine translation
- domain specific
- cross lingual
- question answering
- bilingual dictionaries
- clef evaluation campaign
- website
- search engine
- street view
- feature set
- language learning
- programming language
- cross language
- database
- information retrieval
- multilingual retrieval
- language independent
- benchmark datasets
- natural language
- resource management
- word alignment
- named entities
- resource allocation
- multilingual information retrieval
- query expansion