The brWaC Corpus: A New Open Resource for Brazilian Portuguese.
Jorge A. Wagner Filho, Rodrigo Wilkens, Marco Idiart, Aline Villavicencio
Published in: LREC (2018)
Keyphrases
- brazilian portuguese
- machine translation
- resource management
- resource allocation
- resource constraints
- web resources
- manually annotated
- text data
- data sets
- statistical machine translation
- supervised machine learning
- website
- information retrieval
- multiword
- newspaper articles
- open domain
- resource requirements
- annotated corpus
- machine learning