The Stack: 3 TB of permissively licensed source code.
Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries
Published in: CoRR (2022)
Keyphrases
- source code
- open source
- software systems
- software maintenance
- open source software
- software projects
- software evolution
- high level
- static analysis
- source files
- plagiarism detection
- object oriented systems
- mining software repositories
- open source projects
- program comprehension
- impact analysis
- software artifacts
- code examples
- symbolic execution
- execution traces
- version control
- text files
- program slicing
- software repositories
- reverse engineer