IBM scraped 1 million Flickr photos without consent to build facial recognition diversity dataset
In January 2019, IBM released the 'Diversity in Faces' dataset containing approximately 1 million images scraped from Flickr without the knowledge or consent of the photographers or their subjects. While intended to address racial bias in facial recognition by creating more diverse training data, the dataset was built without any notification to the people whose faces were included. NBC News revealed the lack of consent in March 2019.
Scoring Impact
| Topic | Direction | Relevance | Contribution |
|---|---|---|---|
| Intellectual Property Ethics | -against | secondary | -0.50 |
| User Privacy | -against | primary | -1.00 |
| Overall incident score | -0.429 | | |
Score = avg(topic contributions) × significance (medium ×1) × confidence (0.57)
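The arithmetic in the formula above can be sketched in Python. This is a minimal illustration rather than the site's actual implementation: the function name `incident_score` and the use of a plain unweighted mean are assumptions. With the rounded confidence of 0.57 shown above, the result is roughly -0.4275, slightly off from the listed -0.429; a possible explanation (also an assumption) is that the displayed confidence is itself rounded.

```python
from statistics import mean


def incident_score(contributions, significance, confidence):
    """Average the topic contributions, then scale by significance and confidence.

    Hypothetical helper: assumes an unweighted mean over all topic rows.
    """
    return mean(contributions) * significance * confidence


# Topic contributions taken from the table: Intellectual Property Ethics, User Privacy
contributions = [-0.50, -1.00]

# significance "medium" maps to a multiplier of 1; confidence is the rounded 0.57
score = incident_score(contributions, significance=1.0, confidence=0.57)
print(round(score, 4))
```

Keeping the multipliers as explicit arguments makes it easy to see how a different significance level or confidence value would shift the overall score.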
Evidence (1 signal)
NBC News revealed IBM scraped Flickr photos without consent for facial recognition dataset
NBC News reporter Olivia Solon revealed in March 2019 that IBM's 'Diversity in Faces' dataset, containing approximately 1 million images intended to address bias in facial recognition, was scraped from Flickr without notifying or obtaining consent from photographers or their subjects.