Register identification from the unrestricted open Web using the Corpus of Online Registers of English.
Veronika Laippala, Samuel Rönnqvist, Miika Oinonen, Aki-Juhani Kyröläinen, Anna Salmela, Douglas Biber, Jesse Egbert, Sampo Pyysalo
Published in: Lang. Resour. Evaluation (2023)
Keyphrases
- cross language
- language model
- question answering
- cross lingual
- statistical machine translation
- document collections
- website
- online communication
- online communities
- online resources
- semantic web
- web applications
- internet users
- parallel corpus
- online learning
- web documents
- end users
- web pages
- linked data
- open domain
- web mining
- natural language
- internet enabled
- machine translation system
- web users
- web content
- information sources