Using Similarity Measures to Select Pretraining Data for NER.
Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cécile Paris
Published in: CoRR (2019)
Keyphrases
- data sets
- raw data
- data structure
- similarity measure
- data analysis
- data collection
- data processing
- small number
- data distribution
- computer systems
- database
- data sources
- data mining techniques
- synthetic data
- original data
- named entity recognition
- knowledge discovery
- information extraction
- probabilistic model
- natural language processing
- markov random field
- data model
- high quality
- spatial data
- conditional random fields
- information retrieval
- databases