AI Algorithms Are Biased Against Skin With Yellow Hues
Xiang says the work of improving and expanding measures of skin color will be ongoing, and she stresses the need to keep making progress. Monk says he is glad the topic is getting more attention after being neglected for a long time, and he believes different measures could prove useful depending on the situation. Google spokesperson Brian Gabriel says the company is aware of the new research and is reviewing it.
The color of a person’s skin is determined by the interaction of light with proteins, blood cells, and pigments like melanin. To assess bias in algorithms caused by skin color, the traditional method has been to evaluate their performance on various skin tones using the Fitzpatrick scale, which consists of six options ranging from lightest to darkest. This scale was initially created by a dermatologist to gauge how skin responds to UV light. Last year, Google introduced the Monk scale, which received praise from AI researchers in the tech community for being more inclusive.
Sony’s researchers have conducted a study that reveals a more accurate method of representing the diverse range of skin tones using the international color standard CIELAB. This standard, commonly employed in photo editing and manufacturing, not only considers the depth of color (tone) but also the gradation (hue) when analyzing photos of individuals with different skin types. The findings will be presented at the International Conference on Computer Vision in Paris.
The inadequate representation of red and yellow hues in skin color scales seems to have let bias in image algorithms go unnoticed. When Sony researchers tested open-source AI systems, including an image cropper developed by Twitter and a pair of image-generating algorithms, they found a preference for redder skin. As a result, people with more yellow-toned skin are underrepresented in the final images these algorithms produce. This could disadvantage populations from East Asia, South Asia, Latin America, and the Middle East.
Sony’s researchers have put forward a novel approach for depicting skin color, acknowledging the overlooked diversity. Instead of using a single numerical value, their system utilizes two coordinates to describe the skin color in an image. These coordinates indicate the position on a spectrum ranging from light to dark, as well as the continuum between yellowness and redness, commonly referred to as warm to cool undertones in the cosmetics industry.
The new approach involves identifying the pixels in an image that represent skin, converting the RGB color values of each pixel into CIELAB coordinates, and calculating the average hue and tone across groups of skin pixels. An example in the research shows that the skin tones of former US football star Terrell Owens and late actress Eva Gabor appear similar but their hues differ, with Owens’ image being more red and Gabor’s leaning toward yellow.
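The conversion and averaging steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Sony team's implementation: it assumes the skin pixels have already been picked out by some segmentation step, that the image uses standard sRGB with a D65 white point, that tone is read from the CIELAB lightness value L* and hue from the angle formed by the a* and b* values, and that averaging the CIELAB values before taking the angle is acceptable. The function names are hypothetical.

```python
import math

# D65 reference white, the usual choice for sRGB images (an assumption here).
WHITE_D65 = (0.95047, 1.0, 1.08883)


def _srgb_to_linear(channel):
    """Undo the sRGB gamma curve for one 0-255 channel value."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """Piecewise cube-root function from the CIELAB definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(rgb):
    """Convert one (R, G, B) pixel, 0-255 per channel, to (L*, a*, b*)."""
    r, g, b = (_srgb_to_linear(v) for v in rgb)
    # Linear sRGB to CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_lab_f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def tone_and_hue(skin_pixels):
    """Return (mean L*, hue angle in degrees) for a group of skin pixels.

    skin_pixels is a list of (R, G, B) tuples assumed to be skin already.
    """
    if not skin_pixels:
        raise ValueError("need at least one skin pixel")
    labs = [rgb_to_lab(p) for p in skin_pixels]
    n = len(labs)
    mean_l = sum(lab[0] for lab in labs) / n
    mean_a = sum(lab[1] for lab in labs) / n
    mean_b = sum(lab[2] for lab in labs) / n
    # Hue angle: near 0 degrees points toward red, near 90 toward yellow.
    hue_deg = math.degrees(math.atan2(mean_b, mean_a))
    return mean_l, hue_deg


if __name__ == "__main__":
    redder = [(200, 150, 130)] * 4     # made-up sample pixels
    yellower = [(200, 160, 110)] * 4   # made-up sample pixels
    print(tone_and_hue(redder))
    print(tone_and_hue(yellower))
```

With typical skin colors, both a* and b* are positive, so the hue angle falls between 0 and 90 degrees: a smaller angle means a redder appearance and a larger one a yellower appearance, while L* carries the light-to-dark dimension.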
The Sony team found significant problems when they applied their method to online data and AI systems. In the CelebAMask-HQ data set, which is commonly used to train facial recognition and other computer vision programs, 82 percent of the images skewed toward red skin hues. In the FFHQ data set, created by Nvidia, 66 percent leaned toward the red side. The researchers also found that two generative AI models trained on FFHQ reproduced this bias: about 80 percent of the images generated by each model skewed toward red hues.
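A figure such as "82 percent of images skew red" can be understood as a simple share: give each image a hue value, pick a cutoff between red and yellow, and count how many images fall on the red side. The sketch below shows that arithmetic only. The function name is hypothetical and the 55-degree cutoff is an illustrative assumption, not a value stated in this article.

```python
def share_skewed_red(hue_angles, threshold_deg=55.0):
    """Return the fraction of images whose hue angle is below the cutoff.

    hue_angles holds one hue angle in degrees per image; smaller angles
    are treated as redder. threshold_deg is an assumed, illustrative cutoff.
    """
    hues = list(hue_angles)
    if not hues:
        raise ValueError("need at least one hue angle")
    red_count = sum(1 for h in hues if h < threshold_deg)
    return red_count / len(hues)


if __name__ == "__main__":
    sample_hues = [40.0, 50.0, 60.0, 70.0]  # made-up values
    print(f"{share_skewed_red(sample_hues):.0%} of sample images skew red")
```

The result depends directly on where the cutoff sits, so any comparison across data sets only makes sense when the same cutoff is used for each.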
It didn’t end there. AI programs ArcFace, FaceNet, and Dlib performed better on redder skin when asked to identify whether two portraits correspond to the same person, according to the Sony study. Davis King, the developer who created Dlib, says he’s not surprised by the skew because the model is trained mostly on US celebrity pictures.
Cloud AI tools from Microsoft Azure and Amazon Web Services that detect smiles also worked better on redder hues. Sarah Bird, who leads responsible AI engineering at Microsoft, says the company has been bolstering its investments in fairness and transparency. Amazon spokesperson Patrick Neighorn says, “We welcome collaboration with the research community, and we are carefully reviewing this study.” Nvidia declined to comment.