Exposing.ai uses information from publicly available face and biometric analysis image datasets to provide you with the ability to determine if your Flickr photos were used in AI surveillance research projects. Currently Exposing.ai only provides results for Flickr photos that were found in biometric image training or testing datasets. In the future, this may expand to include other origins.
The search engine checks if your photos were included in a dataset by referencing Flickr identifiers including username, NSID, or photo ID. Match results are displayed only if an exact match is found. The matching Flickr images are then loaded directly from Flickr.com; Exposing.ai does not store a copy of the images. No search data is shared or stored during this project.
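As a rough illustration of what "exact match only" means, here is a minimal sketch in Python. It is not Exposing.ai's actual implementation: the index contents, the dataset names, and the function name `find_datasets` are hypothetical placeholders. It shows only that a query returns results when it equals a stored identifier exactly, and nothing otherwise.

```python
# Hypothetical in-memory index: Flickr identifier -> datasets it appears in.
# The identifiers and dataset names below are placeholders, not real records.
INDEX = {
    "1234567890@N01": ["example_dataset_a", "example_dataset_b"],
    "12345": ["example_dataset_a"],
}


def find_datasets(identifier: str) -> list[str]:
    """Return dataset names for an exact identifier match, else an empty list."""
    # Exact dictionary lookup: no fuzzy, partial, or similarity matching.
    return list(INDEX.get(identifier.strip(), []))


print(find_datasets("1234567890@N01"))  # exact match: two placeholder datasets
print(find_datasets("1234567890@N0"))   # near miss: no results
```

A near-miss identifier returns nothing, which is the behavior the paragraph above describes.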
You can search for your Flickr photos by entering your Flickr username, Flickr path alias, or Flickr photo ID. You can also search for photos of yourself taken by others by using a #hashtag. Example search queries are:
Photo URL: https://www.flickr.com/photos/1234567890@N01/12345/
Flickr NSID: 1234567890@N01
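The query formats above can be told apart with a few simple text patterns. The following Python sketch is an assumption about how such a classifier could look, not the site's real code: the function name `classify_query`, the regular expressions, and the returned dictionary keys are all hypothetical. It assumes an NSID has the form digits, "@N", digits, and that an all-digit query is a photo ID.

```python
import re

# Assumed shapes, based only on the example queries above.
NSID_RE = re.compile(r"^\d+@N\d+$")
PHOTO_URL_RE = re.compile(
    r"^https?://(?:www\.)?flickr\.com/photos/(?P<owner>[^/]+)/(?P<photo_id>\d+)/?$"
)


def classify_query(query: str) -> dict:
    """Guess which kind of Flickr identifier a search query contains."""
    q = query.strip()
    match = PHOTO_URL_RE.match(q)
    if match:
        # The owner segment may be an NSID or a path alias.
        return {
            "kind": "photo_url",
            "owner": match.group("owner"),
            "photo_id": match.group("photo_id"),
        }
    if NSID_RE.match(q):
        return {"kind": "nsid", "value": q}
    if q.startswith("#"):
        return {"kind": "tag", "value": q[1:]}
    if q.isdigit():
        # Assumption: an all-digit query is treated as a photo ID.
        return {"kind": "photo_id", "value": q}
    return {"kind": "username_or_alias", "value": q}


print(classify_query("https://www.flickr.com/photos/1234567890@N01/12345/"))
print(classify_query("1234567890@N01"))
```

One design consequence of this sketch is that a username made up only of digits would be read as a photo ID; a real search tool would need its own rule for that case.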
Using hashtags to search can help locate a photo of yourself if you attended events that were photographed and posted to Flickr. For example, conferences often ask attendees to share images with a themed hashtag. When using hashtags to search, use all lowercase and no spaces (eg use "#mybirthdayparty"). Searching with tags is slower due to the amount of information being processed. Each photo can have dozens of tags, resulting in millions more records to search. If your search is slow to return results, please be patient and try again shortly.
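The "all lowercase and no spaces" rule can be applied mechanically before searching. This is a minimal Python sketch of that normalization; the function name `normalize_hashtag` is hypothetical, and whether the site performs any such cleanup itself is not stated here.

```python
def normalize_hashtag(raw: str) -> str:
    """Lowercase a tag, remove all whitespace, and ensure one leading '#'."""
    tag = raw.strip().lstrip("#")
    # split() with no arguments drops every run of whitespace.
    tag = "".join(tag.split()).lower()
    if not tag:
        raise ValueError("hashtag is empty")
    return "#" + tag


print(normalize_hashtag("My Birthday Party"))  # prints "#mybirthdayparty"
```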
No. The Exposing.ai search engine does not use face recognition or any similar technology. The search results are based only on text attributes of Flickr photos, including the Flickr username, Flickr path alias, Flickr NSID, Flickr photo ID, or Flickr photo tag.
Exposing.ai does not provide any face recognition services. These technologies are not accurate enough to definitively locate yourself in a dataset. As this project aims to highlight, many face recognition or face analysis training datasets contain biased demographic representation in biased lighting, environmental, or capture conditions.
Both teams behind this project (Adam Harvey and Jules LaPlace, and S.T.O.P.) independently evaluated this technology for use in reverse searching datasets prior to collaborating and came to the same conclusion: face recognition is a flawed means of identification that cannot be relied upon for its intended purpose.
For anyone considering using face recognition as a reverse search tool for finding yourself in a dataset, there are several factors to consider. First, many of the photos in public image training datasets were derived from the YFCC100M dataset, with a capture range between 2004 and 2014. The age discrepancy between your appearance today and your appearance from 7 to 17 (or more) years ago will degrade the performance of a face similarity search. The second consideration is that search results will be further degraded the less white and less male you are. This is because many facial recognition tools, especially open source ones, are trained and benchmarked on datasets that are dangerously biased. One such dataset, Labeled Faces in the Wild (LFW), has even added disclaimers stating that it by no means provides a quality reference for how well, or not, a face recognition tool works. Other datasets, including VGG Face, have posted similar disclaimers alerting researchers that the dataset is flawed and will likely result in flawed face recognition models. The third reason is that even with the best face recognition tools, the results are never definitive. In the best case, a facial recognition model can only show you people who look similar. This means that if you're wearing glasses you will likely just match other people wearing glasses. And if you smile you will likely match other people smiling. Even the author of the UCCS face recognition dataset admits that "since the technology isn't so advanced a face recognition software won't be able to remove all your pictures" if you want to be removed from their dataset. (source)
Unfortunately, it is not possible to remove yourself from existing copies of datasets that you may have been included in. This is one of the major problems with image datasets: they cannot be controlled after the initial distribution. Several datasets may allow you to request removal, and this can prevent your photos from being distributed further. Information about requesting removal or opting out is being compiled and will soon be included below your search results.
In the meantime, you can sign up for a newsletter/announcement email from our partner STOP to be notified about site updates and important information related to the search tool. Sign up at https://stopspying.org/exposing-ai.
Yes. You can prevent your photos from being displayed on Exposing.ai, as well as anywhere online, by hiding, deleting, or making private all your Flickr photos. This requires that you log in to your Flickr.com account and edit the specific photos, edit all photos, or delete your account. Immediately after you change the status of your images to private, hidden, or deleted, your photos can no longer be loaded from Flickr.com and therefore cannot be displayed in search results on Exposing.ai.
If you still see your photos being displayed, be sure to clear your web browser's cache and then recheck. A file cached by your web browser is only stored locally on your computer.
If your photos or a photo of you were found in a dataset listed on Exposing.ai, this means that the photos were definitively included in the specified distribution of the dataset. However, each usage of the dataset must then be evaluated on a case-by-case basis to determine if a researcher or company used the unmodified dataset in their research or if the dataset was altered prior to their research.
If a large number of your photos were found in a dataset, the total number of photos displayed in your results may be limited. The display limit is subject to change and may be removed in the future if, for example, a module can be added that allows a user to authenticate as the Flickr account holder.
Exposing.ai is based on MegaPixels.cc, which received initial support from Mozilla. Additional support was provided by research fellowships with the Karlsruhe KIM (Critical Media) group, the Weizenbaum Institut, and Copenhagen Business School. Exposing.ai is currently building more tools and applying for continued support to keep developing this project independently.
There are several related projects that have looked into similar ideas. The artist Philipp Schmitt made a project called "Humans of AI" that highlights the contributions of Flickr images in the COCO dataset. Olivia Solon, Joe Murphy, and Jeremia Kimelman of NBC News provide a form to check if your Flickr username was included in the IBM Diversity in Faces dataset. The author of the FFHQ dataset also provides a simple form to check if your Flickr photo was included in their dataset. Previously, Adam Harvey used face recognition to try to find your image in biometric training datasets at https://ahprojects.com/megapixels-glassroom/ (this project ultimately did not provide the results expected and is deprecated). And Matthias Pitscher made "This Person Does Exist" to highlight the real people behind the fake faces on ThisPersonDoesNotExist.com. If there are others that were omitted, please send info to Adam Harvey on Keybase at https://keybase.io/exposing_ai.
What's different about Exposing.ai? This search tool is actually based on the failure of face recognition to provide accurate results. Exposing.ai (the new version of the MegaPixels.cc project) was initially designed to use face recognition to find your photos in datasets. However, this never worked as intended. Other prototypes were built, eventually leading to the conclusion that the best way to check if your images were used in AI surveillance projects is to search with absolute identifiers (eg Flickr username or photo URL). Taking this idea further, Exposing.ai combines metadata from related, overlapping, and sometimes obscure datasets collected during years of research to provide a unified and scalable search tool, along with resources and context to help users understand and explore the topic further.
If you have further questions, contact Adam Harvey on Keybase at https://keybase.io/exposing_ai or the S.T.O.P. team at