Standardizing linguistic data: method and tools for annotating (pre-orthographic) French.
Simon Gabay, Thibault Clérice, Jean-Baptiste Camps, Jean-Baptiste Tanguy, Matthias Gille Levenson
Published in: CoRR (2020)
Keyphrases
- synthetic data
- data sets
- input data
- statistical methods
- database
- high accuracy
- prior knowledge
- test data
- noisy data
- original data
- training samples
- preprocessing
- prior information
- information loss
- high dimensional data
- clustering method
- data mining techniques
- user input
- support vector machine
- objective function
- data points
- data analysis
- pairwise
- similarity measure
- feature set
- data structure
- computational complexity
- raw data
- cost function
- missing data
- data sources
- end users
- detection method
- knowledge discovery
- data collection
- image data