Pretraining Language- and Domain-Specific BERT on Automatically Translated Text.
Tatsuya Ishigaki, Yui Uehara, Goran Topic, Hiroya Takamura
Published in: RANLP (2023)
Keyphrases
- domain specific
- language generation
- machine translation system
- english text
- controlled natural language
- target language
- general purpose
- text to speech synthesis
- computational linguistics
- domain independent
- human language
- programming language
- text mining
- source language
- automatically generated
- english language
- native language
- extraction rules
- information retrieval
- semi automatically
- natural language
- language learning
- text understanding
- semantic representation
- text to speech
- text documents
- text generation
- machine learning
- keywords
- automatically discovering
- natural language processing
- linguistic analysis
- text retrieval
- free text
- language specific
- semantic representations
- xml documents
- lexical information
- relational databases
- natural language generation
- syntactic categories
- specification language