A Dataset and an Approach for Identity Resolution of 38 Million Author IDs extracted from 2B Git Commits.
Tanner FryTapajit DeyAndrey KarnauchAudris MockusPublished in: MSR (2020)
Keyphrases
- automatically extracted
- intrusion detection system
- intrusion detection
- high resolution
- million images
- database
- benchmark datasets
- real world
- information systems
- real life
- feature vectors
- computer vision
- social networks
- genetic algorithm
- data types
- machine learning
- sampling rate
- tens of thousands
- sample images
- databases