The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI.
Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi Wu, Enrico Shippole, Kurt D. Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, Deb Roy, Sara Hooker
Published in: CoRR (2023)