Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards.
Angelina McMillan-Major, Salomey Osei, Juan Diego Rodriguez, Pawan Sasanka Ammanamanchi, Sebastian Gehrmann, Yacine Jernite
Published in: CoRR (2021)
Keyphrases
- experimental data
- probabilistic model
- prior knowledge
- accurate models
- data sets
- test data
- predictive model
- subject specific
- statistical methods
- statistical model
- learned models
- simulation data
- probability distribution
- natural language processing
- input data
- empirical data
- learning models
- data analysis
- physical models
- hybrid model
- historical data
- raw data
- computational model
- database
- explanatory variables
- models built
- knowledge discovery
- modeling framework
- model selection
- statistical models
- data points
- neural network model
- training data
- data sources
- discrete data
- generation process
- prediction model
- classification models
- parameter estimation
- computational models
- data mining algorithms
- data mining techniques
- synthetic datasets
- large scale data sets
- parametric models
- machine learning
- temporal dependencies