People in Photo Albums (PIPA) is a dataset of photos used for face recognition. The dataset was published in 2015 and contains 60,000 face images of about 2,000 individuals, of which 32,518 photos were taken from Flickr.com.
According to the dataset authors, PIPA was designed to help recognize people's identities in photo albums in an unconstrained setting. But face recognition has applications far beyond personal photo album processing, and sharing a dataset of face images for building face analysis tools contributes to unexpected applications. For example, in 2018 researchers from a military research university in China used the PIPA dataset for their research on "Understanding Humans in Crowded Scenes". The dataset was also used by researchers affiliated with the surveillance company SenseTime and the American surveillance company Facebook.
Because the dataset consists primarily of semi-public photos that people shared online, it contains many images of children, family dinners, weddings, and other scenes that are personal in nature. As of January 2020, Berkeley is no longer distributing the dataset, though the Max Planck Institute in Germany still provides it for free and unrestricted download at https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/people-detection-pose-estimation-and-tracking/person-recognition-in-personal-photo-collections.
The charts below show an analysis of the most frequent image tags that were used for the Flickr images in the PIPA dataset. Thousands of images include tags for #DoD (Department of Defense), #Military, and #ArmedForces.
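A tag-frequency analysis of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the charts: it assumes each photo's tags are already available as a list of strings, and the function name `top_tags` and the sample records are hypothetical.

```python
from collections import Counter


def top_tags(photo_tags, n=10):
    """Return the n most frequent tags across photos.

    Tags are compared case-insensitively with any leading '#' removed,
    and each tag is counted at most once per photo.
    """
    counts = Counter()
    for tags in photo_tags:
        # Normalize each tag, then de-duplicate within the photo via a set.
        normalized = {t.lstrip("#").lower() for t in tags if t.lstrip("#")}
        counts.update(normalized)
    return counts.most_common(n)


# Hypothetical sample records for illustration; not taken from PIPA.
sample = [
    ["#DoD", "#Military", "wedding"],
    ["military", "ArmedForces"],
    ["family", "#dod", "DoD"],
]

print(top_tags(sample, n=2))
```

Counting each tag once per photo keeps a single heavily tagged image from inflating the totals, so the result reflects how many photos carry a tag rather than how many times the tag string appears.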
To help understand how the PIPA dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the People in Photo Albums dataset was collected, verified, and geocoded to show how AI training data has proliferated around the world.
If you reference or use any data from the Exposing.ai project, cite our original research as follows:
@online{Exposing.ai, author = {Harvey, Adam and LaPlace, Jules}, title = {Exposing.ai}, year = 2021, url = {https://exposing.ai}, urldate = {2021-01-01} }
If you reference or use any data from PIPA, cite the authors' work:
@inproceedings{Zhang2015BeyondFF, author = "Zhang, Ning and Paluri, Manohar and Taigman, Yaniv and Fergus, Rob and Bourdev, Lubomir D.", title = "Beyond frontal faces: Improving Person Recognition using multiple cues", booktitle = "2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)", year = "2015", pages = "4804-4813" }