Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers.
Liina Repo, Valtteri Skantsi, Samuel Rönnqvist, Saara Hellström, Miika Oinonen, Anna Salmela, Douglas Biber, Jesse Egbert, Sampo Pyysalo, Veronika Laippala
Published in: EACL (Student Research Workshop) (2021)
Keyphrases
- cross lingual
- lightweight
- text classification
- machine translation
- cross language
- cross lingual information retrieval
- language modeling
- parallel corpus
- language independent
- european languages
- event extraction
- query translation
- machine translation system
- mono lingual
- translation model
- machine learning
- statistical machine translation
- language specific
- language model
- cross language information retrieval
- word alignment
- monolingual retrieval
- news articles
- natural language
- parallel corpora
- supervised learning
- transfer learning
- word sense
- indian languages
- web pages
- feature selection
- machine learning algorithms
- text categorization
- knn
- web documents
- query expansion
- document clustering