Seed Words Based Data Selection for Language Model Adaptation.
Roberto Gretter, Marco Matassoni, Daniele Falavigna
Published in: ASLTRW@MTSummit (2021)
Keyphrases
- raw data
- prior knowledge
- data collection
- data sets
- training data
- data acquisition
- data analysis
- knowledge discovery
- complex data
- high quality
- data distribution
- xml documents
- input data
- noisy data
- synthetic data
- statistical analysis
- data points
- data sources
- website
- data processing
- programming language
- computer systems
- high dimensional data
- sensor data
- application domains
- relational databases
- natural language
- data structure
- data quality
- information retrieval