A Critical Analysis of the Largest Source for Generative AI Training Data: Common Crawl

Stefan Baack
Published in: FAccT (2024)
Keyphrases
  • training data
  • data sets
  • artificial intelligence
  • data analysis
  • image analysis
  • domain knowledge
  • web search
  • statistical analysis
  • database
  • genetic algorithm
  • case-based reasoning
  • quantitative analysis