Google Facial Expression Comparison (FEC) is a dataset of face images taken from Flickr.com for the purpose of developing facial emotion analysis and search software. According to the authors, one of the main applications of this technology is "expression-based image retrieval by using nearest neighbor search in the expression embedding space." Their paper, "A Compact Embedding for Facial Expression Similarity," situates the FEC dataset within the context of larger facial recognition datasets including MS-Celeb-1M and MegaFace.
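To make the quoted application concrete, here is a minimal sketch of expression-based retrieval by nearest neighbor search. The 16-dimensional embeddings below are random placeholders standing in for the output of the paper's embedding network; the gallery size, dimensionality, and distance metric are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Placeholder embeddings: the paper trains a compact network to map each
# face to a point in an "expression embedding space"; here we just use
# random vectors to demonstrate the retrieval step.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(10_000, 16))  # embeddings of indexed faces
query = rng.normal(size=16)              # embedding of the query face

# Nearest neighbor search: rank gallery faces by Euclidean distance
# to the query in the embedding space and keep the closest five.
distances = np.linalg.norm(gallery - query, axis=1)
top5 = np.argsort(distances)[:5]
print("closest gallery faces:", top5, distances[top5])
```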
Although the Google FEC dataset has not been cited or abused nearly as much as MegaFace or MS-Celeb-1M, it is still notable because it includes 87,517 confirmed images taken from Flickr and used for biometric analysis. Exposing.ai found that these 87,517 unique photos belong to 45,382 unique Flickr account holders, and has made the data searchable at exposing.ai/search.
The method used to create the Google FEC dataset follows a long and problematic practice of scraping images, misinterpreting their licenses, and ignoring biometric laws that protect against this. Illinois, for example, prohibits any company from selling or otherwise profiting from a person's biometric information. In the FEC dataset, that biometric information is quantified as expression, which is then used for image querying, with additional stated intentions of use in developing products; the authors also note that they imagine this technology will be useful for "photo album summarization." Given that all of the researchers are from Google and that the data was collected by Google employees, it follows that Google is profiting from the use of the Flickr images.
The following emotion labels were conceived by the authors and used to label the 87,000+ images in the FEC dataset: Amusement, Anger, Awe, Boredom, Concentration, Confusion, Contemplation, Contempt, Contentment, Desire, Disappointment, Disgust, Distress, Doubt, Ecstasy, Elation, Embarrassment, Fear, Interest, Love, Neutral, Pain, Pride, Realization, Relief, Sadness, Shame, Surprise, Sympathy, Triumph.
The dataset is available to download without restriction at https://research.google/tools/datasets/google-facial-expression/. The data is limited to CSV files listing the URL of each Flickr photo containing the faces, along with the bounding box of each face. Each line in the CSV files has the following entries (a minimal parsing sketch follows the list):
- URL of image1 (string)
- Top-left column of the face bounding box in image1 normalized by width (float)
- Bottom-right column of the face bounding box in image1 normalized by width (float)
- Top-left row of the face bounding box in image1 normalized by height (float)
- Bottom-right row of the face bounding box in image1 normalized by height (float)
- URL of image2 (string)
- Top-left column of the face bounding box in image2 normalized by width (float)
- Bottom-right column of the face bounding box in image2 normalized by width (float)
- Top-left row of the face bounding box in image2 normalized by height (float)
- Bottom-right row of the face bounding box in image2 normalized by height (float)
- URL of image3 (string)
- Top-left column of the face bounding box in image3 normalized by width (float)
- Bottom-right column of the face bounding box in image3 normalized by width (float)
- Top-left row of the face bounding box in image3 normalized by height (float)
- Bottom-right row of the face bounding box in image3 normalized by height (float)
- Triplet_type (string) - A string indicating the variation of expressions in the triplet.
- Annotator1_id (string) - A string of random numbers that can be used to search for all the samples in the dataset annotated by a particular annotator.
- Annotation1 (integer)
- Annotator2_id (string)
- Annotation2 (integer)
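To make the layout above concrete, here is a hedged sketch of reading one row and converting the normalized bounding box of image1 back to pixel coordinates. The file name and the image dimensions are illustrative assumptions; per the field list, column coordinates are normalized by image width and row coordinates by image height.

```python
import csv

def face_box_pixels(row, width, height):
    """Recover the image1 face box in pixels from one CSV row.

    Per the field list: entries 1-2 after the URL are the top-left and
    bottom-right columns (normalized by width); entries 3-4 are the
    top-left and bottom-right rows (normalized by height).
    """
    left, right = float(row[1]) * width, float(row[2]) * width
    top, bottom = float(row[3]) * height, float(row[4]) * height
    return left, top, right, bottom

# File name and image size are assumptions for illustration; substitute
# the CSV you downloaded and the actual dimensions of the fetched photo.
with open("faceexp-comparison-data-train-public.csv", newline="") as f:
    row = next(csv.reader(f))
    url = row[0]  # URL of image1
    print(url, face_box_pixels(row, width=1024, height=768))
```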
Opting out is not a reasonable response to having your biometric data used in a commercial research dataset. However, you can email the researchers, Raviteja Vemulapalli or Aseem Agarwala, with any additional questions regarding the dataset. Their email addresses are listed in the research paper at https://arxiv.org/abs/1811.11283.
If you reference or use any data from the Exposing.ai project, cite our original research as follows:
@online{Exposing.ai, author = {Harvey, Adam and LaPlace, Jules}, title = {Exposing.ai}, year = {2021}, url = {https://exposing.ai}, urldate = {2021-01-01} }
If you reference or use any data from Google FEC, cite the authors' work:
@inproceedings{Vemulapalli2019ACE, author = {Vemulapalli, Raviteja and Agarwala, Aseem}, title = {A Compact Embedding for Facial Expression Similarity}, booktitle = {2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, year = {2019}, pages = {5676--5685} }