KIM research fellow Adam Harvey launches a search engine that traces the use of personal images in training face recognition technologies.


Check → if your Flickr photos were used to build face recognition models

This new online tool lets people search face recognition image datasets by username to see whether photos they uploaded to their Flickr accounts in the past have been exploited to improve face recognition. The project was launched on 31 January 2021 and is based on years of research into the image training datasets used for face recognition and related biometric analysis technologies. After tracking down and analyzing hundreds of these datasets, a pattern emerged: millions of images were being downloaded from Flickr, where permissive content licenses are encouraged and biometric data is abundant. Telling the complex story of how yesterday’s photographs became today’s training data is part of the goal of this project.

At the moment, the project includes six datasets that were used for building, benchmarking, or enhancing face recognition. It is largely based on MegaPixels, a project investigating the ethics, origins, and individual privacy implications of face recognition image training datasets and their role in the expansion of biometric surveillance technologies. MegaPixels looked at MegaFace, the largest publicly available facial recognition dataset, which comprises 3.5 million images taken from Flickr and licensed under Creative Commons. Although the dataset was intended only for non-commercial research and educational purposes, MegaPixels uncovered that it had been exploited by companies throughout the world to advance their face recognition technologies.

During 2020, Harvey was a research fellow with KIM, which supported the MegaPixels project through a grant the research group had received from the Volkswagen Stiftung program "AI and the Society of the Future". The research fellowship was used to map the information supply chains of datasets that used images from Flickr. A forthcoming analysis provides insight into more than 25 datasets that used Flickr images for AI computer vision research projects, many of which have applications to surveillance technologies. This research helped build the search engine.


Press coverage → The New York Times