Croissant: A Metadata Format for ML-Ready Datasets.
Mubashara Akhtar, Omar Benjelloun, Costanza Conforti, Pieter Gijsbers, Joan Giner-Miguelez, Nitisha Jain, Michael Kuchnik, Quentin Lhoest, Pierre Marcenac, Manil Maskey, Peter Mattson, Luis Oala, Pierre Ruyssen, Rajat Shinde, Elena Simperl, Goeffry Thomas, Slava Tykhonov, Joaquin Vanschoren, Jos van der Velde, Steffen Vogler, Carole-Jean Wu
Published in: DEEM@SIGMOD (2024)
Keyphrases
- metadata
- digital libraries
- multimedia
- maximum likelihood
- database
- data sets
- databases
- dublin core
- learning objects
- benchmark datasets
- metadata management
- search tools
- information resources
- topic maps
- xml schema
- heterogeneous data
- semantic search
- web services
- data mining
- experimental results on real world
- metadata creation